(12.) General conclusions


(1.) Introduction. 


In dealing with the problem of the relationship of attributes, not capable of 
quantitative measurement, it has been usual to classify the two attributes into a 
number of groups, $A_1, A_2, A_3, \ldots A_s$ and $B_1, B_2, B_3, \ldots B_t$. In this manner a table
has been formed containing s columns and t rows, or s × t compartments. The total
frequency of the population, or of the “universe” under consideration, to use the 
logician’s phrase, is then distributed into sub-groups corresponding to these s × t
compartments. In simple cases of association, as in that of the presence of the
vaccination cicatrix and the recovery from an attack of smallpox, s and t are both
equal to two, and we have a simple four-fold division of the universe. In other cases 
we have higher numbers, as when we classify the human eye into eight colour classes 
and correlate these classes with six or more classes for hair colour. We may even 
run up to as many as 18 to 25 classes for each attribute when we table the coat 
colours of thoroughbred horses or pedigree dogs in the case of pairs of blood relatives.
Hitherto, in order to obtain a measure of the degree of correlation or association, we
have proceeded on the assumption that it was necessary to arrange the system of 
classes like $A_1, A_2, \ldots A_s$ in some order, which corresponded to a real quantitative
scale in the attribute, although we were unable to use this scale directly. Thus one 
arranged eye-colours in what appeared to correspond to a scale of varying amounts 
of orange pigment ; the coat colours of horses were arranged in an order corresponding 
fairly to what an artist would call their “value.” I even analysed hair tints by 
photographic processes. In all such cases the order seemed of vital importance. 
Once this order was settled, the methods of my memoir* on the correlation of
characters not quantitatively measurable could be applied: the actual scale
corresponding to the classification could be deduced, and we were able, on the assumption
of normal frequency, to actually plot the regression lines for the correlation of a
variety of attributes.† The conception, however, of order in the classification was
at times very hampering. Take three broad classes like those for human temper— 
quick tempered, good natured, and sullen ; it is difficult to grasp the exact meaning 
of a quantitative scale at the basis of this classification, and it is not obvious that the 
right order is necessarily that with good-natured in the middle. Or, again, take the 
case of human hair ; omitting the brown reds, we can get a practically continuous series 
of shades from jet black to flaxen, and from flaxen with increasing red up to the 
deepest reds. Only the brown reds come in and upset the system! We seem, 
therefore, forced to take a double scale, first one of black, and then one of red 
pigment. Or, again, take the coat colour of greyhounds; these are classified into as 
many as 40 fairly narrow groups, and we can arrange these groups in ascending 
order of red, or black, or other pigmentation. We have more than one possible scale. 

Now in recent work on such things as temper in man, eye colour in man, and hair 
colour in man or other animals, I have proceeded to arrange my groups in two or 
three different orders, and to calculate the correlation on the basis of these 
different orders. The results for the different orders came out in rather striking 
agreement, and the first sort of conclusion that one was tempted to draw was, for 
example, that the inheritance of pigmentation was strikingly alike for all pigments. 
But the agreement was in some cases far closer than one is accustomed to find when 
one compares the inheritance of directly measurable characters, and I soon became 
convinced that owing to some important theoretical law hitherto overlooked, the 
order of the groups by which we classify our attributes is a matter of no importance 
when we are determining correlation. The group order is all important for variation, 
it has practically no influence on correlation. We may put sullen tempers where we 
please in regard to quick and good-natured ; we may place the shades of red hair at 
either end of the hair scale or in the middle, and the inheritance coefficient will come
out nearly the same in value. Nay, we may go further, and classify finger prints
like Mr. GALTON into “tents,” “arches,” “whorls,” “croziers,” &c., &c., and still be
able to find a numerical value of the degree of resemblance between two blood
relatives, although any arrangements of such groups into a possible quantitative
scale may be inconceivable. The object of this present paper is to deal with this
novel conception of what I have termed contingency, and to see its relation to our
older notions of association and normal correlation. The great value of the idea of
contingency for economic, social, and biometric statistics seems to me to lie in the
fact that it frees us from the need of determining scales before classifying our
attributes. I shall endeavour to illustrate the importance of this freedom in the
illustrations which follow the theoretical treatment of the subject.

* ‘Phil. Trans.,’ A, vol. 195, pp. 1-47.

† For example, for health and ability and for the correlation of the psychical and physical characters,
see the “Fourth Annual Huxley Lecture,” ‘Journal of the Anthropological Institute,’ vol. 33, pp. 194-195.


(2.) On the Conception of Contingency. 


In mathematical treatises on algebra a definition is usually given of independent 
probability. If p be the probability of any event, and q the probability of a second
event, then the two events are said to be independent, if the probability of the
combined event be p × q. Now let A be any attribute or character and let it be
classified into the groups $A_1, A_2, \ldots A_s$, and let the total number of individuals
examined be N, and let the numbers which fall into these groups be $n_1, n_2, \ldots n_s$
respectively. Then the probability of an individual falling into one or other of these
groups is given by $n_1/N, n_2/N, \ldots n_s/N$ respectively. Now suppose the same
population to be classified by any other attribute into the groups $B_1, B_2, \ldots B_t$, and
the group frequencies of the N individuals to be $m_1, m_2, \ldots m_t$ respectively. The
probability of an individual falling into these groups will be respectively
$m_1/N, m_2/N, m_3/N, \ldots m_t/N$. Accordingly the number of combinations of $B_v$ with $A_u$ to be
expected on the theory of independent probability if N pairs of attributes are
examined is

$$N \times \frac{n_u}{N} \times \frac{m_v}{N} = \frac{n_u m_v}{N} = \nu_{uv}.$$

Let the number actually observed be $n_{uv}$. Then, allowing for the errors of random
sampling,

$$n_{uv} - \frac{n_u m_v}{N} = n_{uv} - \nu_{uv}$$

is the deviation from independent probability in the occurrence of the groups $A_u$, $B_v$.
Clearly the total deviation of the whole classification system from independent
probability must be some function of the $n_{uv} - \nu_{uv}$ quantities for the whole table. I
term any measure of the total deviation of the classification from independent 
probability a measure of its contingency. Clearly the greater the contingency, the 
greater must be the amount of association or of correlation between the two 
attributes, for such association or correlation is solely a measure from another
standpoint of the degree of deviation from independence of occurrence. 

Now it must be quite clear that if we make our measurement of contingency any 
function whatever of such quantities as $n_{uv} - \nu_{uv}$, its magnitude will be absolutely
independent of the order of classification, i.e., its value will be unchanged if we
re-arrange the A’s and the B’s in any manner whatever. This is the fundamental
gain of this new conception of contingency. But precisely as we can measure 
position or acceleration in a great variety of ways, so it is possible to measure 
contingency. We must try to select out of these ways those which: (a) bring 
contingency into line with the customary notions of correlation and association ; and 
(b) permit of not too laborious calculations leading to the required measure. 
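
The invariance under re-arrangement stated above can be checked on a small example. The following Python sketch is only an illustration: the table of counts is hypothetical, and the sum of squared deviations is chosen merely as one arbitrary function of the quantities $n_{uv} - \nu_{uv}$; the function names are assumptions of this sketch.

```python
import math

def independence_frequencies(table):
    """nu_uv = (row total) * (column total) / N for every compartment."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[ru * cv / n for cv in col_totals] for ru in row_totals]

def sum_sq_dev(table):
    """One arbitrary function of the deviations n_uv - nu_uv: their sum of squares."""
    nu = independence_frequencies(table)
    return sum((table[i][j] - nu[i][j]) ** 2
               for i in range(len(table))
               for j in range(len(table[0])))

def rearranged(table, row_order, col_order):
    """Re-order the A classes (rows) and B classes (columns)."""
    return [[table[i][j] for j in col_order] for i in row_order]

# Hypothetical 3 x 3 table of observed frequencies.
table = [[30, 12, 8],
         [10, 25, 15],
         [5, 9, 36]]
shuffled = rearranged(table, row_order=[2, 0, 1], col_order=[1, 2, 0])

print(sum_sq_dev(table), sum_sq_dev(shuffled))
```

The two printed values agree up to floating-point rounding, because a permutation of classes only permutes the terms of the sum.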

We will consider these points at some length. I have shown in a paper,* “On
Deviations from the Probable in a Correlated System of Variables,” that if
$m'_1, m'_2, \ldots m'_n$ be any system of observed frequencies and $m_1, m_2, \ldots m_n$ be any system
of theoretical frequencies known a priori, then if

$$\chi^2 = S\left\{\frac{(m'_q - m_q)^2}{m_q}\right\}, \quad \text{from } q = 0 \text{ to } n,$$

be calculated, we can deduce a quantity P from $\chi^2$ which is the probability that in
any trial a system $m''_1, m''_2, \ldots m''_n$ of observed frequencies will occur, which
deviates more from $m_1, m_2, \ldots m_n$ than the actually observed system does. Tables
have been worked out by Mr. PALIN ELDERTON giving the value of P for a
considerable range of values of $\chi^2$ and $n'$, and have been published in ‘Biometrika.’†

Now it will be obvious that if we want to measure contingency, we really want to 
measure the deviation of the observed results from independent probability, and 
therefore if we take $m_1, m_2, \ldots m_n$ to correspond to the system $\nu_{uv}$, and $m'_1, m'_2, \ldots m'_n$
to correspond to the actually observed system $n_{uv}$,

$$\chi^2 = S\left\{\frac{(n_{uv} - \nu_{uv})^2}{\nu_{uv}}\right\}$$

will be a proper quantity to calculate, and P would measure how far the observed
system is or is not compatible with a basis of independent probability. If P be large
the chances are in favour of the system arising from independent probability; if P be
small there is certainly association between the attributes. Hence 1 − P would be a
proper measure of the contingency. I propose to call 1 − P the contingency grade.
Further, it is convenient to have a name for a function closely related to $\chi^2$. I shall
call

$$\phi^2 = \chi^2/N \qquad \text{(ii.)}$$


the mean square contingency. 


* ‘Phil. Mag.,’ July, 1900, pp. 157-175.
† Vol. 1, p. 155.
It will be seen that, in the method by which we have approached the problem, we
have not had to consider the question of the sign of contingencies like $n_{uv} - \nu_{uv}$;
our mean square contingency is based on a summation of squares extending to all the
s × t compartments of the table. But if we treat now of quantities like $n_{uv} - \nu_{uv}$,
their total sum must be zero, since for the whole table

$$S(n_{uv}) = S(\nu_{uv}) = N.$$

Let us suppose that the symbol $\Sigma$ refers to a summation of all the positive
contingencies, and let

$$\psi = \Sigma(n_{uv} - \nu_{uv})/N \qquad \text{(iii.)};$$

then $\psi$ shall be spoken of as the mean contingency. Clearly any functions of either
$\phi$ or $\psi$ would serve to measure the contingency. We shall be guided in our choice
of such functions by considering what are the values of $\phi^2$ and $\psi$ in the case of normal
correlation. 
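
The quantities defined in this section can be computed directly from a table of frequencies. The sketch below is a minimal illustration under stated assumptions: the fourfold table is hypothetical, and the function name and dictionary keys are chosen here for convenience only.

```python
import math

def contingency_measures(table):
    """Square contingency chi^2, mean square contingency phi^2 = chi^2/N,
    and mean contingency psi (positive sub-contingencies only, divided by N)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    positive_sum = 0.0
    total_deviation = 0.0
    for u, n_u in enumerate(row_totals):
        for v, m_v in enumerate(col_totals):
            nu = n_u * m_v / n            # frequency on independent probability
            dev = table[u][v] - nu        # sub-contingency n_uv - nu_uv
            chi2 += dev * dev / nu
            total_deviation += dev
            if dev > 0:
                positive_sum += dev
    return {"chi2": chi2, "phi2": chi2 / n,
            "psi": positive_sum / n, "total_deviation": total_deviation}

# Hypothetical fourfold table with N = 100.
measures = contingency_measures([[40, 10],
                                 [10, 40]])
print(measures)
```

For this table every $\nu_{uv}$ is 25 and every sub-contingency is ±15, so $\chi^2 = 36$, $\phi^2 = .36$ and $\psi = .30$, while the sub-contingencies sum to zero as the text requires.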


(3.) On the Relation between Mean Square Contingency and Normal Correlation. 


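Let x and y denote the deviations from their respective means of two characters or
attributes, of which $\sigma_x$, $\sigma_y$ are the standard deviations and r is the correlation. Then
if we assume a normal distribution of frequency, $z_0\,dx\,dy$ would be the frequency of
individual pairs falling between x and x + dx, y and y + dy, where

$$z_0 = \frac{N}{2\pi\sigma_x\sigma_y}\, e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)} \qquad \text{(iv.)},$$

on the assumption of independent probability, and $z\,dx\,dy$, where

$$z = \frac{N}{2\pi\sqrt{1 - r^2}\,\sigma_x\sigma_y}\, e^{-\frac{1}{2}\frac{1}{1 - r^2}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)} \qquad \text{(v.)},$$

on the assumption of contingent probability.
We then have at once

$$\phi^2 = S\left\{\frac{(z\,\delta x\,\delta y - z_0\,\delta x\,\delta y)^2}{N z_0\,\delta x\,\delta y}\right\} = \frac{1}{N}\iint \frac{(z - z_0)^2}{z_0}\,dx\,dy,$$

and we have only to insert the values of z and $z_0$ given by (iv.) and (v.), and integrate
all over the plane of x, y, to find the mean square contingency.
Now, if $ac > b^2$, we know that

$$\frac{1}{2\pi}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-\frac{1}{2}(ax^2 - 2bxy + cy^2)}\,dx\,dy = \frac{1}{\sqrt{ac - b^2}} \qquad \text{(vi.)}.$$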
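This is all we need, for if $x = \sigma_x x'$, $y = \sigma_y y'$:

$$\phi^2 = \frac{1}{N}\iint \left(\frac{z^2}{z_0} - 2z + z_0\right) dx\,dy$$

$$= \frac{1}{2\pi(1 - r^2)}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-\frac{1}{2}\left(\frac{1 + r^2}{1 - r^2}x'^2 - \frac{4r}{1 - r^2}x'y' + \frac{1 + r^2}{1 - r^2}y'^2\right)}\,dx'\,dy'$$

$$\quad - \frac{2}{2\pi\sqrt{1 - r^2}}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-\frac{1}{2}\frac{1}{1 - r^2}(x'^2 - 2rx'y' + y'^2)}\,dx'\,dy' + \frac{1}{2\pi}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-\frac{1}{2}(x'^2 + y'^2)}\,dx'\,dy'$$

$$= \frac{1}{1 - r^2}\cdot\frac{1}{\sqrt{\frac{(1 + r^2)^2}{(1 - r^2)^2} - \frac{4r^2}{(1 - r^2)^2}}} - 2 + 1$$

$$= \frac{1}{1 - r^2} - 2 + 1 = \frac{r^2}{1 - r^2} \qquad \text{(vii.)}.$$

Thus the mean square contingency is simply $r^2/(1 - r^2)$. Or,

$$r = \pm\sqrt{\frac{\phi^2}{1 + \phi^2}} \qquad \text{(viii.)}.$$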


Thus the relationship between mean square contingency and correlation in the case
of normal frequency is of an extremely simple character.
We see at once :— 


(i.) That since the mean square contingency is absolutely independent of the 
arrangement of our classes, the coefficient of correlation is also entirely 
independent of the arrangement of our classes on the basis of any assumed 
order or scale. 

(ii.) Provided our classes are sufficiently small to allow of us legitimately 
replacing by groupings over small areas the theoretical integrations, the 
coefficient of correlation can be found from the mean square contingency. 


We have thus an entirely new method of finding correlation in the case of 
quantitatively non-measurable characters. It assumes, however, that our classification
groups are sufficiently numerous and their contents sufficiently small to justify us in
supposing that the contingency has reached a definite limit. Clearly in working in
the future by the contingency method, we shall have to adopt rather more numerous 
classes, and they should not contain too irregular proportions of individuals, but we 
can then afford to drop any question of scale or order of grouping. 
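
The dependence of this result on fine grouping can be explored numerically. The following sketch is an illustration only, under assumptions that are not in the memoir: a standardised normal surface with r = .5 is tabulated on a 120 × 120 grid of small squares (cell contents approximated by the ordinate at the cell centre), and the same material is then merged into a fourfold table divided at the means. All function names and grid settings are hypothetical.

```python
import math

def normal_cell_probabilities(r, half_width=6.0, step=0.1):
    """Approximate cell contents of a standardised normal surface on a square grid."""
    k = int(round(2 * half_width / step))
    mids = [-half_width + (i + 0.5) * step for i in range(k)]
    c = 1.0 / (2.0 * (1.0 - r * r))
    dens = [[math.exp(-c * (x * x - 2.0 * r * x * y + y * y)) for y in mids]
            for x in mids]
    total = sum(sum(row) for row in dens)
    return [[value / total for value in row] for row in dens]

def merge(p, block):
    """Throw together block x block neighbouring cells into one coarser cell."""
    k = len(p) // block
    return [[sum(p[i][j]
                 for i in range(a * block, (a + 1) * block)
                 for j in range(b * block, (b + 1) * block))
             for b in range(k)] for a in range(k)]

def phi2(p):
    """Mean square contingency of a table of proportions: S(p^2 / (a b)) - 1."""
    a = [sum(row) for row in p]
    b = [sum(col) for col in zip(*p)]
    return sum(p[i][j] ** 2 / (a[i] * b[j])
               for i in range(len(a)) for j in range(len(b))) - 1.0

def first_coefficient(phi2_value):
    return math.sqrt(phi2_value / (1.0 + phi2_value))

r = 0.5
fine = normal_cell_probabilities(r)      # 120 x 120 small classes
fourfold = merge(fine, 60)               # 2 x 2 table divided at the means

print(phi2(fine), first_coefficient(phi2(fine)))
print(phi2(fourfold), first_coefficient(phi2(fourfold)))
```

With the fine grouping the computed $\phi^2$ lies close to $r^2/(1 - r^2) = 1/3$ and the derived coefficient close to .5, whereas the fourfold table of the same material yields a much smaller $\phi^2$, about 1/9, so that the formula no longer returns the correlation.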

It may be asked whether this method of deriving the correlation from the 
contingency cannot replace the earlier method of deducing the correlation by the 
fourfold division of the material. The answer is that in some cases it can do so very 
advantageously, but it is very far from doing so in all. The contingency found from
a fourfold table is a perfectly real and very proper measure of the deviation of its
material from independent probability. But if this mean square contingency be
substituted in equation (viii.), it will not give us the correlation. The proper mean
square contingency to give us the correlation must be based on a sufficiently large
number of classes. When, however, we take, say, 20 classes for each attribute, we
have 400 terms to deal with in calculating $\phi^2$, and although the result might then
possibly give a more accurate value for the correlation than that found from a fourfold 
division, yet the labour of determining it is far greater and may be excessive. 
Further, the simple classification into two or three groups may be all we are able to 
make at all, or all we can conveniently make. Hence the new conception of
contingency, while illuminating the whole subject, especially as demonstrating that
the correlation is independent of scale or grouping, does not do away with the older
method of the fourfold division. I propose to call the expression

$$C_1 = \sqrt{\frac{\phi^2}{1 + \phi^2}}$$

the first coefficient of contingency.

We note that with small enough classes the coefficient of contingency becomes the
coefficient of correlation. Accordingly, with a view of lessening the number of
coefficients in use, I adopt the following convention: Any expression or function of
either the mean square contingency ($\phi^2$) or the mean contingency ($\psi$) (or indeed of
any other measure of the contingency), which, when the grouping is sufficiently small,
is theoretically equal to the coefficient of correlation, on the hypothesis of normal
frequency, shall be termed a coefficient of contingency. All such coefficients of
contingency must, on the same hypothesis, become equal on a sufficiently small 
grouping, and they will scarcely differ widely from each other when the frequency is 
not absolutely normal and the grouping is merely moderately small. These points 
will be illustrated later. 


(4.) On the Relation of Mean Contingency to Normal Correlation. 


A great deal of the labour of finding either the coefficient of contingency or the 
coefficient of correlation by the method of mean square contingency when the groups 
are numerous, depends upon the squaring of the contingencies and dividing by the 
frequency to be expected on the basis of independent probabilities. The whole of 
this labour is escaped, if we work with the mean contingency instead of the mean 
square contingency; further, since in this case we only sum for the positive
contingencies, neglecting the negative, we have usually to deal with only, or often less
than, a moiety of the terms involved in calculating $\phi^2$. On the other hand, there is no
simple relation between the correlation and the mean contingency such as we have
found between correlation and mean square contingency in equation (viii.) above.
The relation is far more complex and is only expressible in the form of integrals
reducible by quadratures. Still, for practical purposes we rarely want the coefficient
of contingency to more than two decimal places. Hence, if the integral be evaluated
for the coefficient proceeding by equal intervals, we can plot a curve giving the value
of the coefficient of contingency in terms of the mean contingency, and this will be 
sufficiently accurate to enable us to read off the former in terms of the latter to the 
required degree of accuracy. The enquiry also brings out some other points not 
without interest.* 

To investigate the curve which in a normal correlation surface separates on the 
plane of xy areas of positive from areas of negative contingency. 

The frequency due to independent probability will be equal to that due to the 
actual contingent probability, when 


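$$\frac{N}{2\pi\sigma_x\sigma_y}\, e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)} = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1 - r^2}}\, e^{-\frac{1}{2}\frac{1}{1 - r^2}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)},$$

where r is the coefficient of correlation, or of contingency.
Clearly

$$(1 - r^2)\log_e(1 - r^2) = -r^2\left\{\frac{x^2}{\sigma_x^2} - \frac{2}{r}\frac{xy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right\} \qquad \text{(ix.)}.$$
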

Since r is always less than unity, this curve is clearly a hyperbola, which possesses 
several interesting properties. We see at once that all the contingency of one sense 
is grouped into the space between the two branches of this hyperbola, and that the 
contingency of the other sense is grouped into the two separate spaces inside the two 
branches. Thus contingency of either sense is for normal correlation continuous, and
abrupt changes of sign in the contingency, beyond the limits of random sampling,
are not to be expected.

By testing on actual correlation tables I find this hyperbola comes out in a fairly 
marked manner, in fact, quite as significantly as the elliptic contours of equal 
frequency. 

I propose to consider the properties of this zero contingency hyperbola—it forms 
the curve along which two really contingent events have a frequency identical with 
their independent probability. 
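
The defining property of this curve can be verified numerically. The sketch below assumes standardised variables ($\sigma_x = \sigma_y = 1$, N = 1) and writes the curve as $x^2 - 2xy/r + y^2 = \beta_0$ with $\beta_0 = -(1 - r^2)\log_e(1 - r^2)/r^2$, which is the form obtained by dividing the equation of the curve by $-r^2$; the function names and the trial values of x and r are hypothetical.

```python
import math

def z_independent(x, y):
    """Ordinate on the assumption of independent probability."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)

def z_contingent(x, y, r):
    """Ordinate of the normal correlation surface with correlation r."""
    q = (x * x - 2.0 * r * x * y + y * y) / (1.0 - r * r)
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - r * r))

def beta0(r):
    return -(1.0 - r * r) * math.log(1.0 - r * r) / (r * r)

def hyperbola_y(x, r):
    """The two values of y on the zero-contingency curve for a given x."""
    root = math.sqrt(x * x * (1.0 / (r * r) - 1.0) + beta0(r))
    return x / r - root, x / r + root

r = 0.6
points = [(x, y) for x in (-2.0, 0.0, 1.5) for y in hyperbola_y(x, r)]
for x, y in points:
    print(x, y, z_independent(x, y), z_contingent(x, y, r))
```

At every point found on the curve the two ordinates coincide, while at the origin, which lies between the two branches, the contingent ordinate exceeds the independent one.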

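Consider the two families of curves:

$$\alpha = \frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} \qquad \text{(x.)},$$

$$\beta = \frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} \qquad \text{(xi.)}.$$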


* I have to heartily thank my assistant, Dr. L. N. G. FILON, for the substance of the first part of the
investigation given below, down to equation (xiii.). I owe the calculation and plotting of the curves
$u = e^{-k\sec\theta}$ to my assistant, Mr. J. C. M. GARNETT.
Since r is always < 1, the α family form a set of concentric, similar, and
similarly-situated ellipses, and the β family a set of concentric, similar, and similarly-situated
hyperbolas. Any conic having double contact with the hyperbola $\beta_0$ of zero
contingency defined by (ix.), at the ends of a diameter y = mx, has for its equation

$$\frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} + \lambda(y - mx)^2 = \beta_0.$$

If this be identical with an ellipse α, we have, by comparing coefficients and
eliminating λ and m,

$$\beta_0^2/\alpha^2 = 1/r^2.$$

Consequently $\alpha = \pm r\beta_0$, the sign being determined from the fact that α must
always be positive for real ellipses.


Now the ordinate z of the normal frequency surface is given by

$$z = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1 - r^2}}\, e^{-\frac{\alpha}{2(1 - r^2)}};$$

and to find the mean contingency we must determine the whole volume lying inside
the two branches of the above hyperbola, integrating on both sides of the line of
contact of the families of hyperbolas and ellipses.*


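We have $\iint z\,dx\,dy$ over this area

$$= \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1 - r^2}}\iint e^{-\frac{\alpha}{2(1 - r^2)}}\,\frac{1}{J}\,d\alpha\,d\beta,$$

where

$$J = \frac{\partial(\alpha, \beta)}{\partial(x, y)}$$

from (x.) and (xi.).
But from (x.) and (xi.)

$$\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)^2 - \frac{4x^2y^2}{\sigma_x^2\sigma_y^2} = \frac{\alpha^2 - r^2\beta^2}{1 - r^2}.$$

Or, choosing the signs to make J positive, we have

$$J = \frac{4\sqrt{1 - r^2}}{r\sigma_x\sigma_y}\sqrt{\alpha^2 - r^2\beta^2}.$$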


* The ellipses and hyperbolas have common pairs of conjugate diameters; one line of contact is one
of the asymptotes of the hyperbola $x^2/\sigma_x^2 - y^2/\sigma_y^2 = 1$; and tangents at an intersection point of any of the
family of ellipses with any of the family of hyperbolas are respectively parallel to conjugate diameters of
this hyperbola. These geometrical properties, however, need not detain us here.

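Thus the required integral is

$$N I_r = \frac{N r}{2\pi(1 - r^2)}\int_{r\beta_0}^{\infty} e^{-\frac{\alpha}{2(1 - r^2)}}\,d\alpha\int_{\beta_0}^{\alpha/r}\frac{d\beta}{\sqrt{\alpha^2 - r^2\beta^2}}$$

$$= \frac{N}{2\pi(1 - r^2)}\int_{r\beta_0}^{\infty}\cos^{-1}\left(\frac{r\beta_0}{\alpha}\right) e^{-\frac{\alpha}{2(1 - r^2)}}\,d\alpha.$$
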

To simplify put, using (ix.),

$$\alpha = r\beta_0\sec\theta, \qquad k = \frac{r\beta_0}{2(1 - r^2)} = -\frac{1}{2r}\log_e(1 - r^2) \qquad \text{(xii.)},$$

where k will always be positive, since r < 1.
We have

$$I_r = \frac{k}{\pi}\int_0^{\pi/2}\theta\, e^{-k\sec\theta}\sec\theta\tan\theta\,d\theta,$$

or, integrating by parts,

$$I_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-k\sec\theta}\,d\theta \qquad \text{(xiii.)}.$$

The curves $u = e^{-k\sec\theta}$ were then plotted with our coordinatograph for a series of
values of k or r on a large scale, drawn in with a spline and integrated with a Coradi
compensating planimeter. The values of $I_r$ resulting are tabled on p. 15.
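
The integral which was evaluated with the planimeter can also be approximated by simple quadrature. The sketch below assumes the form $I_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-k\sec\theta}\,d\theta$ with $k = -\log_e(1 - r^2)/(2r)$, and uses a plain midpoint rule; the function names and the number of steps are choices made for this illustration, not the procedure of the memoir.

```python
import math

def k_of_r(r):
    """k = -log(1 - r^2) / (2 r), positive for 0 < r < 1."""
    return -math.log(1.0 - r * r) / (2.0 * r)

def integral_i(r, steps=20000):
    """Midpoint-rule estimate of I_r for 0 < r < 1."""
    k = k_of_r(r)
    h = (math.pi / 2.0) / steps
    # Midpoints avoid theta = pi/2, where sec(theta) is infinite and u vanishes.
    total = sum(math.exp(-k / math.cos((i + 0.5) * h)) for i in range(steps))
    return total * h / math.pi

for r in (0.2, 0.5, 0.8):
    print(r, round(integral_i(r), 4))
```

The values so obtained can be set against the tabled entries .3895, .2830 and .1700 for r = .20, .50 and .80; small differences in the third or fourth decimal are to be expected from a graphical integration.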

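We have next to investigate what is the volume $N Q_r$ of the surface of independent
probability

$$z_0 = \frac{N}{2\pi\sigma_x\sigma_y}\, e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)},$$

which falls within the same hyperbola of contingency. We shall then have in $Q_r - I_r$
the required value of ψ, the mean contingency on the basis of normal correlation. We
have

$$Q_r = \frac{1}{2\pi\sigma_x\sigma_y}\iint e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)}\,dx\,dy,$$

taken over the space inside the two branches of the hyperbola

$$\frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} = \beta_0.$$

Write $x = x'\sigma_x$, $y = y'\sigma_y$, and we have

$$Q_r = \frac{1}{2\pi}\iint e^{-\frac{1}{2}(x'^2 + y'^2)}\,dx'\,dy', \qquad x'^2 - \frac{2x'y'}{r} + y'^2 = \beta_0.$$

Transform to polars, $\rho\cos\theta = x'$, $\rho\sin\theta = y'$,

$$\rho^2 = \frac{r\beta_0}{r - \sin 2\theta}.$$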
This shows us that the axes are given by $\theta = \frac{\pi}{4}$ and $\frac{\pi}{2} + \frac{\pi}{4}$, or are a and b, where

$$a^2 = r\beta_0/(1 + r), \qquad b^2 = r\beta_0/(1 - r).$$

Take these axes as axes of coordinates. Then we have to integrate

$$Q_r = \frac{1}{\pi}\iint e^{-\frac{1}{2}(x^2 + y^2)}\,dx\,dy$$

over the area inside one branch of the hyperbola

$$x^2(1 + r) - y^2(1 - r) = r\beta_0 \qquad \text{(xiv.)}.$$

Let

$$x^2 + y^2 = \alpha, \qquad x^2 - y^2 + r(x^2 + y^2) = r\beta \qquad \text{(xv.)},$$

and let us transfer the integrations to α and β.


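We have

$$x^2y^2 = \tfrac{1}{4}\left\{\alpha^2 - r^2(\alpha - \beta)^2\right\},$$

and

$$\iint dx\,dy = \iint\frac{r\,d\alpha\,d\beta}{4\sqrt{\alpha^2 - r^2(\alpha - \beta)^2}}$$

over one-half one branch of the hyperbola.

Thus we have

$$Q_r = \frac{r}{2\pi}\int_{r\beta_0/(1 + r)}^{\infty} e^{-\frac{1}{2}\alpha}\,d\alpha\int_{\beta_0}^{(1 + r)\alpha/r}\frac{d\beta}{\sqrt{\alpha^2 - r^2(\alpha - \beta)^2}} \qquad \text{(xvi.)}.$$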


The limits are obtained from the consideration, easily seen on a figure, that for a
given α we must integrate from $\beta = \beta_0$, the given hyperbola, to $\beta = (1 + r)\alpha/r$, the
touching hyperbola; and then for α we must take every circle from that touching $\beta_0$,
i.e., $\alpha = r\beta_0/(1 + r)$, up to infinity.
We will first integrate with regard to β, and put

$$r(\alpha - \beta) = -\alpha\sin\phi.$$

This gives, when $\beta = (1 + r)\alpha/r$, $\phi = \frac{1}{2}\pi$; and when

$$\beta = \beta_0, \qquad \phi = \sin^{-1}\frac{r(\beta_0 - \alpha)}{\alpha} = \phi_0.$$
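Thus we find

$$Q_r = \frac{1}{2\pi}\int_{r\beta_0/(1 + r)}^{\infty} e^{-\frac{1}{2}\alpha}\left(\tfrac{1}{2}\pi - \phi_0\right)d\alpha.$$

Take

$$\cos\chi = r(\beta_0 - \alpha)/\alpha,$$

then

$$\alpha = \infty, \quad \cos\chi = -r; \qquad \alpha = r\beta_0/(1 + r), \quad \cos\chi = 1.$$

Hence

$$Q_r = \frac{1}{2\pi}\int_0^{\cos^{-1}(-r)}\chi\, e^{-\frac{1}{2}\frac{r\beta_0}{r + \cos\chi}}\,\frac{r\beta_0\sin\chi}{(r + \cos\chi)^2}\,d\chi = \frac{1}{\pi}\int_0^{\cos^{-1}(-r)} e^{-\frac{1}{2}\frac{r\beta_0}{r + \cos\chi}}\,d\chi,$$

observing that the term between the limits vanishes at both.
Take

$$\cos\theta = (r + \cos\chi)/(r + 1).$$

Then

$$\chi = 0, \quad \theta = 0; \qquad \chi = \cos^{-1}(-r), \quad \theta = \tfrac{1}{2}\pi.$$

Thus we find finally, after some reductions,

$$Q_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-\kappa\sec\theta}\sqrt{\frac{1 + \cos\theta}{c + \cos\theta}}\,d\theta \qquad \text{(xviii.)},$$

where

$$c = (1 - r)/(1 + r), \qquad \kappa = \frac{r\beta_0}{2(1 + r)} = -\frac{1 - r}{2r}\log_e(1 - r^2) \qquad \text{(xix.)}$$

$= (1 - r)k$, of the integral $I_r$.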
Tables were now formed of c and κ and the ordinates of the curves

$$v = e^{-\kappa\sec\theta}\sqrt{\frac{1 + \cos\theta}{c + \cos\theta}}$$

calculated.* These ordinates were plotted on a large scale by aid of a Coradi
coordinatograph and the resulting curves integrated as before; the values of $Q_r$ thus
found are given with the values of $I_r$ and ψ in the table below. I believe this table
gives the mean contingency in terms of the correlation true to at least three places of
decimals. The u and v curves are both interesting analytically and subject to rather
curious changes of type. We were aided in plotting them by calculating, where
needful, $du/d\theta$ and $dv/d\theta$. Finally, the values of r were plotted by my demonstrator,
Mr. L. W. ATCHERLEY, to the corresponding values of ψ. Thus a curve was obtained,
which enables us to read off the correlation from the contingency correct to at least
two places of decimals, sufficient for nearly all practical purposes.

* I owe the calculation of these ordinates to Dr. ALICE LEE.



TABLE I. Table of Integrals $I_r$, $Q_r$, and the Contingency ψ for Values of r.

    r       I_r       Q_r        ψ
  0.00     .5000     .5000     .0000
   .05     .4620     .4762     .0142
   .10     .4342     .4652     .0310
   .20     .3895     .4536     .0641
   .30     .3501     .4498     .0996
   .40     .3162     .4547     .1385
   .50     .2830     .4643     .1813
   .60     .2489     .4814     .2325
   .70     .2128     .5106     .2978
   .80     .1700     .5524     .3824
   .90     .1186     .6279     .5093
   .95     .0796     .7009     .6213
  1.00     .0000    1.0000    1.0000


Diagram I. at the end of this memoir will therefore serve for most purposes of 
interpolation, and it will be seen that now that the integrals have been evaluated and 
the diagram constructed, the correlation can be very easily found from mean
contingency. But the method seems to me distinctly inferior to that of mean square
contingency, and this for much the same reasons that mean error calculations are 
inferior to mean square error work in curve fitting. Further, the grade of contingency 
can be found at once from a knowledge of mean square contingency, and whatever be 
the distribution is a significant and interpretable constant. This is only true of the
correlation deduced from mean contingency if the distribution be normal. 
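
The passage from mean contingency to correlation can be sketched in two ways: by recomputing $\psi = Q_r - I_r$ by quadrature, and by interpolating in the tabled values as a stand-in for reading the diagram. The code below is an illustration under stated assumptions. It takes $I_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-k\sec\theta}d\theta$ and $Q_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-\kappa\sec\theta}\sqrt{(1 + \cos\theta)/(c + \cos\theta)}\,d\theta$, with $k = -\log_e(1 - r^2)/(2r)$, $\kappa = (1 - r)k$ and $c = (1 - r)/(1 + r)$; the midpoint rule, the linear interpolation and all names are choices of this sketch.

```python
import math

# (r, psi) pairs as given in Table I.
TABLE_I = [(0.00, 0.0000), (0.05, 0.0142), (0.10, 0.0310), (0.20, 0.0641),
           (0.30, 0.0996), (0.40, 0.1385), (0.50, 0.1813), (0.60, 0.2325),
           (0.70, 0.2978), (0.80, 0.3824), (0.90, 0.5093), (0.95, 0.6213),
           (1.00, 1.0000)]

def quarter_integral(f, steps=20000):
    """Midpoint rule for the integral of f from 0 to pi/2."""
    h = (math.pi / 2.0) / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))

def mean_contingency(r):
    """psi = Q_r - I_r for 0 < r < 1 under the assumed integral forms."""
    k = -math.log(1.0 - r * r) / (2.0 * r)
    kappa = (1.0 - r) * k
    c = (1.0 - r) / (1.0 + r)
    i_r = quarter_integral(lambda t: math.exp(-k / math.cos(t))) / math.pi
    q_r = quarter_integral(
        lambda t: math.exp(-kappa / math.cos(t))
        * math.sqrt((1.0 + math.cos(t)) / (c + math.cos(t)))) / math.pi
    return q_r - i_r

def r_from_psi(psi):
    """Linear interpolation of r between neighbouring tabled values of psi."""
    for (r0, p0), (r1, p1) in zip(TABLE_I, TABLE_I[1:]):
        if p0 <= psi <= p1:
            return r0 + (r1 - r0) * (psi - p0) / (p1 - p0)
    raise ValueError("psi must lie between 0 and 1")

psi_half = mean_contingency(0.5)   # to be compared with the tabled .1813
r_read = r_from_psi(0.16)          # falls between the entries for r = .40 and .50
print(round(psi_half, 4), round(r_read, 4))
```

Linear interpolation is adequate only where the tabled values are close together; towards r = 1 the curve steepens and a finer table or the diagram itself would be needed.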

(5.) To sum up our results so far :— 


We have, if
$n_{uv}$ be the actual frequency of a group in the population N which combines the
characters $A_u$ and $B_v$, and $\nu_{uv}$ be the frequency of this group on the hypothesis of
independent probability, then
$n_{uv} - \nu_{uv}$ is simply a sub-contingency,

$S\left\{\dfrac{(n_{uv} - \nu_{uv})^2}{\nu_{uv}}\right\} = \chi^2$ may be termed the square contingency,

$\dfrac{1}{N}\,S\left\{\dfrac{(n_{uv} - \nu_{uv})^2}{\nu_{uv}}\right\} = \phi^2$ is the mean square contingency,

$\dfrac{1}{N}\,\Sigma(n_{uv} - \nu_{uv}) = \psi$, where Σ is the sum for positive (or negative)
sub-contingencies only, is the mean contingency.
Any one of these expressions is a measure of the deviation of the system from
independent probability, and therefore of the amount of association or correlation 
between the characters or attributes involved. But any function of these expressions 
is also a proper measure. Such functions are :— 

(a.) The contingency grade. This is 1 − P, where P is to be found from $\chi^2$ by aid
of the tables for “goodness of fit.” See ‘Biometrika,’ vol. 1, pp. 155, et seq.

(b.) The mean square contingency coefficient = $C_1$, where

$$C_1 = \sqrt{\frac{\phi^2}{1 + \phi^2}}.$$

(c.) The mean contingency coefficient = $C_2$, where $C_2$ is to be found from the table
on p. 15 or from Diagram I. at the end of this memoir.
In the case of sufficiently small grouping and normal correlation we have 


$$C_1 = C_2 = \text{coefficient of correlation}.$$


But it must not be forgotten that this is essentially a limiting, not a general case.
Nevertheless the approach to equality of the two contingency coefficients will be a 
good measure of the normality of the distribution and the suitability as to smallness 
of our elements of grouping. 

(6.) A little experience of actual working, however, shows that in practice it is 
perfectly easy to overshoot the mark in fineness of grouping. Suppose that in 
dealing with 1000 cattle we find a single instance of a calf inscribed as “mulberry,”
say the offspring of a red cow by a dark fawn bull. Now if there be 30 dark fawn
bulls, the independent probability of a dark fawn bull having a mulberry offspring
is .03. Hence the sub-contingency for a ♂ parent-offspring table = 1 − .03 = .97,
and the corresponding contribution to the square contingency will be (.97)²/.03, or
upwards of 31. The fact is, that when we come to very fine groupings we get at
once into difficulties owing to our having to record by units only. Suppose
“mulberry” calves actually had no relation to any special parentage, but were rare 
anomalies occurring once among 1000 calves, or perhaps were merely an odd breeder’s 
fancy description, then a unit cannot be divided in the proportions of the colour 
parentage, it must fall into some one colour parentage group. The result is 
that a few isolated individuals will give large contributions to the mean square 
contingency. The above example is purely hypothetical, but similar cases have 
actually occurred in dealing with colour problems by the contingency method. They 
are exactly similar to those which occur when dealing with outlying individuals by 
the test for “goodness of fit.” In a frequency distribution we proceed only by units,
but the theory gives fractional values of the frequency; hence in forming the value of
$\chi^2$ to measure goodness of fit, one or two unit “outliers,” although not improbable as
far as the whole of the tail of a curve is concerned, may be exceedingly improbable if
considered from the standpoint of the actual group in which they do occur. This
point must be carefully borne in mind in actual practice, for by sufficient refinement
of grouping, i.e., till we reduce certain groups to a single individual or two, the mean
square contingency can be increased in a remarkable manner. 
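
The arithmetic of the hypothetical “mulberry” calf may be set out as a short sketch. The variable and function names below are chosen for this illustration; the figures are those of the example in the text.

```python
def unit_contribution(expected):
    """Contribution to the square contingency of a single observed individual
    in a compartment whose frequency on independent probability is `expected`."""
    return (1.0 - expected) ** 2 / expected

herd = 1000            # cattle dealt with
dark_fawn_bulls = 30   # parents in the dark fawn class
mulberry_calves = 1    # the single "mulberry" record

nu = dark_fawn_bulls * mulberry_calves / herd   # frequency expected: .03
contribution = unit_contribution(nu)            # (.97)^2 / .03
print(nu, round(contribution, 2))
```

A single unit thus adds upwards of 31 to $\chi^2$, and the smaller the expected frequency of the compartment into which it happens to fall, the larger this contribution becomes.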

(7.) Of course this is merely saying that the probable errors of the
sub-contingencies increase largely when we make $\nu_{uv}$ very small. Unfortunately I have
not yet succeeded in determining the probable errors of the contingency coefficients. 
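If $c_{uv}$ be the contingency, determined by

$$c_{uv} = n_{uv} - \frac{n_u m_v}{N} \qquad \text{(xxi.)},$$

and $\Sigma_{c_{uv}}$ its standard deviation for random sampling, I find

$$\Sigma_{c_{uv}}^2 = n_{uv}\left(1 - \frac{n_{uv}}{N}\right) + \frac{n_u m_v}{N^2}\left(n_u + m_v - \frac{4n_u m_v}{N}\right) - \frac{2n_{uv}}{N}\left(n_u + m_v - \frac{3n_u m_v}{N}\right) \qquad \text{(xxii.)},$$

so that the probable error of any individual contingency, $= .67449\,\Sigma_{c_{uv}}$, is determined.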
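
A hedged sketch of this probable error follows. It assumes that the squared standard deviation of $c_{uv} = n_{uv} - n_u m_v/N$ takes the first-order form $n_{uv}(1 - n_{uv}/N) + (n_u m_v/N^2)(n_u + m_v - 4n_u m_v/N) - (2n_{uv}/N)(n_u + m_v - 3n_u m_v/N)$, which is what the usual treatment of random sampling of cell and marginal frequencies gives; the function names and the trial tables are hypothetical.

```python
import math

def sub_contingency_variance(n_uv, n_u, m_v, n):
    """Assumed first-order sampling variance of c_uv = n_uv - n_u * m_v / N."""
    return (n_uv * (1.0 - n_uv / n)
            + (n_u * m_v / n ** 2) * (n_u + m_v - 4.0 * n_u * m_v / n)
            - 2.0 * (n_uv / n) * (n_u + m_v - 3.0 * n_u * m_v / n))

def sub_contingency_probable_error(n_uv, n_u, m_v, n):
    """Probable error = .67449 times the standard deviation."""
    variance = sub_contingency_variance(n_uv, n_u, m_v, n)
    return 0.67449 * math.sqrt(max(variance, 0.0))

# Fourfold table with 25 in every compartment (N = 100): no contingency.
variance_equal = sub_contingency_variance(25, 50, 50, 100)
# Degenerate table with a single compartment: nothing can vary.
variance_single = sub_contingency_variance(100, 100, 100, 100)
print(variance_equal, sub_contingency_probable_error(25, 50, 50, 100))
```

For the equal fourfold table the variance comes out as 6.25, agreeing with the familiar value $N\,p(1 - p)\,q(1 - q)$ for independent attributes with p = q = ½, and for the degenerate table it vanishes, as it should.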
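Further, if $R_{c_{uv}c_{u'v'}}$ be the correlation between errors due to random sampling in two
contingencies $c_{uv}$ and $c_{u'v'}$, not belonging to either the same row or column,

$$\Sigma_{c_{uv}}\Sigma_{c_{u'v'}}R_{c_{uv}c_{u'v'}} = -\frac{n_{uv}n_{u'v'}}{N} + 2\,\frac{n_{u'}m_{v'}n_{uv} + n_u m_v n_{u'v'}}{N^2} + \frac{n_{u'}m_v n_{uv'} + n_u m_{v'}n_{u'v}}{N^2} - \frac{4n_u n_{u'}m_v m_{v'}}{N^3} \qquad \text{(xxiii.)}.$$
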


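Similarly we find for the correlation of errors of two contingencies of the same
column, $R_{c_{uv}c_{u'v}}$, the result

$$\Sigma_{c_{uv}}\Sigma_{c_{u'v}}R_{c_{uv}c_{u'v}} = -\frac{n_{uv}n_{u'v}}{N} - \frac{n_u n_{u'v} + n_{u'}n_{uv}}{N}\left(1 - \frac{3m_v}{N}\right) + \frac{n_u n_{u'}m_v}{N^2}\left(1 - \frac{4m_v}{N}\right) \qquad \text{(xxiv.)},$$

and for errors of two contingencies of the same row,

$$\Sigma_{c_{uv}}\Sigma_{c_{uv'}}R_{c_{uv}c_{uv'}} = -\frac{n_{uv}n_{uv'}}{N} - \frac{m_v n_{uv'} + m_{v'}n_{uv}}{N}\left(1 - \frac{3n_u}{N}\right) + \frac{n_u m_v m_{v'}}{N^2}\left(1 - \frac{4n_u}{N}\right) \qquad \text{(xxv.)}.$$
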

Results (xxii.) to (xxv.) enable us to find the probable errors and the error
correlations for any individual contingencies which will arise from random sampling,
and are so far of value; but when we attempt to find the general expression for the
probable error of either the mean or mean square contingency, it becomes so complex
that there appears little hope of deducing a simple result. Arithmetically the
problem might be solved at the expense of rather troublesome numerical calculations
if the number of sub-groups was not very large. A general and simple expression for
the probable error of $\psi$ or $\phi^2$ involving $\psi$ or $\phi^2$ only does not appear likely to exist,
and an expression involving all the sub-group frequencies would be very troublesome
for computation. Practically the errors of the contingency coefficients may be fairly
reasonably taken to lie between the probable errors of $r$ as found by a fourfold
division of a table and by the product method, approaching the latter more closely as
the number of sub-groups is sufficiently increased. With the experience of probable
errors of fourfold tables before us we may, I think, safely take the probable error of a
contingency coefficient $C$ for rough judgments to be less than
$$2 \times .67449\,\frac{1 - C^2}{\sqrt{n}},$$


i.e., double the probable error of a correlation coefficient found from the product
moment. At the same time we must distinctly be cautious, remembering the difficulty
as to isolated units referred to in the previous section.
We may look at the probable error of the contingency from another standpoint.
Taking the mean squared contingency, we have


$$1 + \phi^2 = \frac{1}{1 - r^2}.$$

Therefore

$$\delta\phi^2 = \frac{2r}{(1 - r^2)^2}\,\delta r,$$

and accordingly, if $\Sigma_{\phi^2}$, $\Sigma_r$ be the standard deviations in errors of $\phi^2$ and $r$,

$$\Sigma_{\phi^2} = \frac{2r}{(1 - r^2)^2}\,\Sigma_r = \frac{2r}{(1 - r^2)^2}\,\frac{1 - r^2}{\sqrt{N}}{}^{*} = \frac{2}{\sqrt{N}}\,\frac{r}{1 - r^2} = \frac{2}{\sqrt{N}}\,\phi\sqrt{1 + \phi^2}.$$

Hence if we were to determine $\phi^2$ from $r$, the probable error of $\phi^2$ would be
given by

$$\text{Probable error of } \phi^2 = .67449\,\frac{2}{\sqrt{N}}\,\sqrt{1 + \phi^2}\;\phi.$$

Or, we can put it into the more useful form,

$$\text{Percentage probable error of } \phi^2 = \frac{134.898}{\sqrt{N}}\sqrt{\frac{1 + \phi^2}{\phi^2}} \qquad \text{(xxvi.)}.$$

Thus the percentage probable error increases rapidly as the contingency gets smaller.

* ‘Phil. Trans.,’ A, vol. 191, p. 242.

Of course, the probable error of $\phi^2$ as found from $r$ is not necessarily the same as
the probable error of $\phi^2$ found directly, but it may serve as a guide to its approximate
value.

If it were the same, the probable error of $r$ as found from $\phi^2$ would be
$.67449/\{(1 + \phi^2)\sqrt{N}\}$, a result, as indicated in the previous paragraph, much too
small, except possibly for very successful systems of grouping.
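
The probable-error expressions of this section are easily put into numbers. The following is a minimal sketch only: the function names are hypothetical, and the sample values ($N = 1078$, $r = .514$) are borrowed from the stature illustration given later in the memoir purely for the sake of example.

```python
import math

PE_FACTOR = 0.67449  # probable error = 0.67449 x standard deviation


def pe_r_product_moment(r, n):
    """Probable error of r found by the product-moment method."""
    return PE_FACTOR * (1.0 - r * r) / math.sqrt(n)


def pe_contingency_rough_bound(c, n):
    """Rough upper limit suggested in the text: double the product-moment value."""
    return 2.0 * pe_r_product_moment(c, n)


def pe_phi2_from_r(phi2, n):
    """Probable error of phi^2 when phi^2 is deduced from r."""
    phi = math.sqrt(phi2)
    return PE_FACTOR * 2.0 / math.sqrt(n) * phi * math.sqrt(1.0 + phi2)


def pct_pe_phi2(phi2, n):
    """Percentage probable error of phi^2, equation (xxvi.)."""
    return 134.898 / math.sqrt(n) * math.sqrt((1.0 + phi2) / phi2)


# Illustrative values only (assumed for this sketch)
n, r = 1078, 0.514
phi2 = r * r / (1.0 - r * r)
print("phi^2 =", phi2)
print("probable error of phi^2 =", pe_phi2_from_r(phi2, n))
for small_phi2 in (0.3, 0.1, 0.03):
    # the percentage error grows as the contingency shrinks
    print(small_phi2, pct_pe_phi2(small_phi2, n))
```

The loop shows the point made in the text: for a fixed $N$ the percentage probable error rises quickly as $\phi^2$ falls.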

(8.) To find under what other condition than normal correlation small changes in
the order of grouping will not affect the value of the correlation.

Let us assume the unit of grouping to be very small, but not necessarily the same
for all groups. Let the two characters or attributes be $x$ and $y$, and suppose $n_s$ to
be the total frequency of individuals in the range $y_s - \epsilon$ to $y_s + \epsilon$, and $n_{s+1}$ to be the
total frequency in the range $y_{s+1} - \epsilon'$ to $y_{s+1} + \epsilon'$. Let $y_{s+1} - y_s = \epsilon + \epsilon' = h$ be so
small that its square may be neglected. Let $\bar{x}$, $\bar{y}$ be the mean values of the
characters, $N$ the total frequency. We will find the changes in the moments and
constants supposing the arrays $n_s$ and $n_{s+1}$ interchanged in position.


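Clearly $\delta\bar{x} = 0$ and $\delta\sigma_x = 0$.

$$N(\bar{y} + \delta\bar{y}) = S(y_s n_s) + h(n_s - n_{s+1}),$$

or,

$$\delta\bar{y} = h(n_s - n_{s+1})/N.$$

$$N(\sigma_y + \delta\sigma_y)^2 = S(y_s^2 n_s) + 2h(y_s n_s - y_{s+1} n_{s+1}) - N(\bar{y} + \delta\bar{y})^2,^{*}$$

$$2N\sigma_y\,\delta\sigma_y = 2h(y_s n_s - y_{s+1} n_{s+1}) - 2N\bar{y}\,\delta\bar{y},$$

$$\frac{\delta\sigma_y}{\sigma_y} = h\,\frac{(y_s - \bar{y})\,n_s - (y_{s+1} - \bar{y})\,n_{s+1}}{N\sigma_y^2}.$$

Next if

$$P = S(xy) - N\bar{x}\bar{y},$$

$$P + \delta P = S(xy) + h(n_s\bar{x}_s - n_{s+1}\bar{x}_{s+1}) - N\bar{x}\bar{y} - N\bar{x}\,\delta\bar{y},$$

or,

$$\delta P = h\{n_s(\bar{x}_s - \bar{x}) - n_{s+1}(\bar{x}_{s+1} - \bar{x})\},$$

where $\bar{x}_s$ and $\bar{x}_{s+1}$ are the means of the arrays $n_s$ and $n_{s+1}$.
But if $r$ be the correlation coefficient of the $x$ and $y$ characters,

$$r = \frac{P}{N\sigma_x\sigma_y}.$$

Therefore

$$\frac{\delta r}{r} = \frac{\delta P}{P} - \frac{\delta\sigma_x}{\sigma_x} - \frac{\delta\sigma_y}{\sigma_y},$$

and substituting the above values,

$$\frac{\delta r}{r} = h\,\frac{n_s(\bar{x}_s - \bar{x}) - n_{s+1}(\bar{x}_{s+1} - \bar{x})}{N r\sigma_x\sigma_y} - h\,\frac{(y_s - \bar{y})\,n_s - (y_{s+1} - \bar{y})\,n_{s+1}}{N\sigma_y^2}.$$

If this is to vanish for any value of $s$ and $h$, it will be sufficient that

$$\bar{x}_s - \bar{x} = \frac{r\sigma_x}{\sigma_y}(y_s - \bar{y})$$

and

$$\bar{x}_{s+1} - \bar{x} = \frac{r\sigma_x}{\sigma_y}(y_{s+1} - \bar{y}).$$

Or, if the mean $\bar{x}_s$ of any $n_s$-array of individuals be determined by

$$\bar{x}_s - \bar{x} = \frac{r\sigma_x}{\sigma_y}(y_s - \bar{y}).$$

But this is the condition for linear regression.

* It must be noted here that the squares of the changes in $\bar{y}$ and $\sigma_y$ are neglected. Hence the changes
must not be so great that $\delta\bar{y}$ and $\delta\sigma_y$ are sensible as compared with $\bar{y}$ and $\sigma_y$.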


Hence we conclude that in any correlated system of variables, obeying the law of
linear regression, we can, without sensibly modifying the correlation, interchange two
adjacent $y$-arrays (e.g., two rows of the correlation table), provided the grouping be
fine. But if we can interchange any two adjacent $y$-arrays, we can, by a repetition
of such changes, interchange any two $y$-arrays whatever; and a precisely similar
statement must be valid for any two $x$-arrays (e.g., two columns of the correlation
table). Hence, given a sufficiently small system of grouping, we may state that in all
cases of linear regression the actual order of the scales is immaterial as far as the
determination of the correlation is concerned.
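
The result may be tried numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it builds a finely grouped table from a normal surface with $r = .5$ (for which the regression is linear), on an assumed grid of step $h = .1$ from $-4$ to $+4$ in standard units, and then interchanges two adjacent $y$-arrays by exchanging their scale values. All names are hypothetical.

```python
import math


def correlation(x_vals, y_vals, table):
    """Product-moment r of a grouped table; table[i][j] is the frequency at (x_i, y_j)."""
    n = sx = sy = sxx = syy = sxy = 0.0
    for i, x in enumerate(x_vals):
        for j, y in enumerate(y_vals):
            f = table[i][j]
            n += f
            sx += f * x
            sy += f * y
            sxx += f * x * x
            syy += f * y * y
            sxy += f * x * y
    mx, my = sx / n, sy / n
    return (sxy / n - mx * my) / math.sqrt((sxx / n - mx * mx) * (syy / n - my * my))


def normal_table(x_vals, y_vals, r):
    """Frequencies proportional to the normal correlation surface (unit variances)."""
    c = 1.0 - r * r
    return [[math.exp(-(x * x - 2 * r * x * y + y * y) / (2 * c)) for y in y_vals]
            for x in x_vals]


h = 0.1                                     # assumed fine unit of grouping
grid = [round(-4.0 + h * k, 10) for k in range(81)]
table = normal_table(grid, grid, 0.5)
r_before = correlation(grid, grid, table)

# Interchange the y-arrays s and s+1: each array takes the other's place on the scale.
s = 50                                      # y_s = 1.0, y_{s+1} = 1.1
y_swapped = list(grid)
y_swapped[s], y_swapped[s + 1] = y_swapped[s + 1], y_swapped[s]
r_after = correlation(grid, y_swapped, table)

print(r_before, r_after, r_after - r_before)
```

With this fine grouping the change in $r$ is of the order of the neglected squares of $h$; a coarser grid, or a surface with markedly non-linear regression, would be expected to show a larger change.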

The practical importance of this result would appear to be great, for it frees us
when dealing with scale orders from the need for supposing normal frequency; the
indifference of the scale order when determining correlation is still true, provided the
regression is linear; and this linearity of regression is not only found from observation
to be very general (for example, in inheritance problems*) but follows from theory
itself in the case of various hypotheses.†

In actual practice, of course, the degree of fineness of the grouping is limited by
many considerations, and hence it will often be better to proceed by the fourfold
division method, taking that division where possible at a very distinct classification.
But the general principle now demonstrated will enable us in future to pay much less
attention to the actual order chosen for the scales if we are dealing with a class of
characters for which we may reasonably presume the regression to be sensibly linear.

* See “The Laws of Inheritance in Man.—I. Inheritance of the Physical Characters,” ‘Biometrika,’
vol. 2, pp. 362-3; also “Inheritance of Mental and Moral Characters in Man,” ‘Huxley Memorial
Lecture,’ 1903, ‘Journal of the Anthropological Institute,’ vol. 33, pp. 185-7.

† “Contributions to the Theory of Evolution.—XII. On a Generalised Theory of Alternative
Inheritance, with special reference to MENDEL’s Laws.” ‘Phil. Trans.,’ A, vol. 203, p. 85.

(9.) If we take the crudest possible division of our material into only four groups,
thus

  a      |  c      |  a + c
  d      |  b      |  d + b
  a + d  |  c + b  |  N

corresponding to what Mr. YULE has termed the association of two attributes, we
have at once

$$\psi = \frac{2(ab - cd)}{N^2}, \qquad \phi^2 = \frac{(ab - cd)^2}{(a + d)(c + b)(a + c)(d + b)} \qquad \text{(xxvii.)}.$$

Now it is clear that in this case $\phi^2$ reduces to $r_{hk}^2$, where $r_{hk}$ is the correlation
between errors in the position of the means of the two characters under consideration,
as determined by a fourfold table, and $\tfrac{1}{2}\psi$ is in this simple case what I have defined
as the transfer per unit of total frequency.* Both are expressions intimately
connected with the conception of association, and have already been discussed in
relation to it.† The coefficients, $C_1$ and $C_2$, of contingency, either of which might
serve as a measure of the association, will not in this simple case, however, be
necessarily even approximately equal to each other, still less to either the coefficient
of correlation or Mr. YULE’s coefficient of association.‡

It is worth while illustrating this on a numerical example. Taking the small-pox
returns for the epidemic of 1890, we have:—

  Cicatrix.  |  Recoveries.  |  Deaths.  |  Totals.
  Present    |  1562         |   42      |  1604
  Absent     |   383         |   94      |   477
  Totals     |  1945         |  136      |  2081

* ‘Phil. Trans.,’ A, vol. 195, pp. 12 and 14.
† Ibid., p. 15 et seq.
‡ ‘Phil. Trans.,’ A, vol. 194, p. 272.

These give us $\phi^2 = .0845$, $\chi^2 = 175.76$, $\psi = .0604$. From these we find

$C_1 = .279$, $C_2 = .190$.

YULE’s coefficient of association = .803.

Coefficient of correlation by fourfold division = .595.

Grade of contingency = $1 - P$,* where $P = 718/10^{40}$.

Now so far as numerical values go these things are all totally different. $C_1$, $C_2$,
and the coefficient of association depend very largely on where the fourfold division is
taken.† It is extremely difficult to use them therefore for comparative purposes. On
the other hand, the coefficient of correlation, with the assumption, however, of
normality, is free of this restriction; it brings us into line with other things for
comparative purposes. The grade of contingency is also independent in a sense of
the division, i.e., it has a definite physical meaning. What it tells us is this, that the
deviation from independent probability in the relation between the result of a case of
small-pox and presence or absence of cicatrix is such that the above table could only
arise 718 times in $10^{40}$ cases if the two events were absolutely independent.

If, instead of a table like the above, we take a number of alternative possibilities
for each attribute, the coefficient of association loses its uniqueness of meaning;
$C_1$ and $C_2$ still retain their significance, and as the number of alternatives becomes
greater, merge in the coefficient of correlation. The grade of contingency, on the other
hand, retains the same perfectly definite meaning throughout. I think this statement
may serve as some warning of the caution needful in using the coefficients now
introduced. The degree of approach of both $C_1$ and $C_2$ to the correlation must be
studied for each special class of cases, and only when this has been done will their
use be really legitimate and effective.
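
The quantities quoted for the small-pox table may be re-computed directly from the four frequencies. The following is a sketch only, with hypothetical variable names; it assumes the cell lettering $a, c$ (upper row) and $d, b$ (lower row), and it takes $P$ for four groups as the sum of the tail integral and the term in $\chi e^{-\frac{1}{2}\chi^2}$, using `math.erfc` for the integral. $C_2$ and the fourfold correlation are not reproduced here, since they require the diagram and the normal-surface solution respectively.

```python
import math

# Fourfold table:   a | c
#                   d | b
a, c = 1562, 42     # cicatrix present: recoveries, deaths
d, b = 383, 94      # cicatrix absent:  recoveries, deaths
N = a + b + c + d

diff = a * b - c * d
psi = 2.0 * diff / N ** 2                                       # mean contingency
phi2 = diff ** 2 / ((a + d) * (c + b) * (a + c) * (d + b))      # mean square contingency
chi2 = N * phi2
C1 = math.sqrt(phi2 / (1.0 + phi2))                             # first contingency coefficient
yule_q = diff / (a * b + c * d)                                 # Yule's coefficient of association

# P for four groups: sqrt(2/pi) * integral_chi^inf exp(-t^2/2) dt  +  sqrt(2/pi) * chi * exp(-chi^2/2)
chi = math.sqrt(chi2)
P = math.erfc(chi / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * chi * math.exp(-chi2 / 2.0)

print(phi2, chi2, psi, C1, yule_q, P)
```

The computed $P$ is of the order of $7 \times 10^{-38}$, which is consistent in magnitude with the grade of contingency quoted in the text.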


* When the number of groups = 4, we have (‘Phil. Mag.,’ vol. 50, p. 157 et seq.):—

$$P = \sqrt{\frac{2}{\pi}}\int_{\chi}^{\infty} e^{-\frac{1}{2}\chi^2}\,d\chi + \sqrt{\frac{2}{\pi}}\,\chi\,e^{-\frac{1}{2}\chi^2} = \sqrt{\frac{2}{\pi}}\,e^{-\frac{1}{2}\chi^2}\,\chi\left\{1 + \frac{1}{\chi^2} - \frac{1}{\chi^4} + \frac{3}{\chi^6} - \frac{15}{\chi^8} + \ldots\right\},$$

whence $P$ is easily found if $\chi^2$ be large.
† YULE, loc. cit., p. 276.


(10.) On the Relation between Multiple Contingency and Multiple Normal
Correlation.


Suppose instead of a single correlation table we have a multiple correlation system.
Such a system is well illustrated by the cabinet at Scotland Yard, which contains the
measurements of habitual criminals on the old system of body measurements, now
discarded in favour of a finger-print index. We have in this case a division of the
cabinet into 3 compartments, which mark a threefold division of long, medium, and
short head lengths. Each of these vertical divisions is then sub-divided horizontally
into three divisions giving the corresponding divisions for head breadth; each of these
head-breadth divisions has three drawers for large, moderate, and small face breadths.
Each drawer is sub-divided into three sections for three finger groups, and these again
into compartments for cubit groups, and so on. If this be carried out for the seven
characters dealt with, we should have ultimately $3^7$ sub-groups forming a multiple
correlation system of the 7th order.* We may ask what is the mean square
contingency of such a system and to what extent does it diverge from an independent
probability system? Of course, for an ideal anthropometric index system the
divergence should be very slight.

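Let $x_1, x_2, \ldots x_n$ be the $n$ variables of a multiple normal correlation surface, to which
the equation is

$$z = \frac{N}{(2\pi)^{n/2}\,\sigma_1\sigma_2\ldots\sigma_n\sqrt{R}}\,\exp\left[-\tfrac{1}{2}\left\{S_1\!\left(\frac{R_{pp}}{R}\,\frac{x_p^2}{\sigma_p^2}\right) + 2S_2\!\left(\frac{R_{pq}}{R}\,\frac{x_p x_q}{\sigma_p\sigma_q}\right)\right\}\right].$$

Here $\sigma_1, \sigma_2, \ldots \sigma_n$ are the standard deviations of the $n$ variables; $S_1$ denotes a sum
of all values of $p$ from 1 to $n$, $S_2$ a sum of all unlike values of $p$ and $q$ from 1 to $n$;
while $R$ is the determinant

$$R = \begin{vmatrix} 1 & r_{12} & r_{13} & \ldots & r_{1n} \\ r_{21} & 1 & r_{23} & \ldots & r_{2n} \\ r_{31} & r_{32} & 1 & \ldots & r_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ r_{n1} & r_{n2} & r_{n3} & \ldots & 1 \end{vmatrix},$$

and $R_{pq}$ is the minor corresponding to the constituent $r_{pq}$, and the $r$’s are the
correlation coefficients.†
Now if $\phi^2$ be the mean square contingency, we have

$$\phi^2 = \frac{1}{N}\int_{-\infty}^{+\infty}\!\!\ldots\int_{-\infty}^{+\infty}\frac{(z - z_0)^2}{z_0}\,dx_1\,dx_2\ldots dx_n,$$

where $z_0$ = value of $z$ when all the $r$’s are zero, or

$$z_0 = \frac{N}{(2\pi)^{n/2}\,\sigma_1\sigma_2\ldots\sigma_n}\,\exp\left[-\tfrac{1}{2}\left\{S_1\!\left(\frac{x_p^2}{\sigma_p^2}\right)\right\}\right].$$

Thus we have, writing $x_1 = \sigma_1 x'_1$, etc.,

$$\phi^2 = \frac{1}{(2\pi)^{n/2}}\int_{-\infty}^{+\infty}\!\!\ldots\int_{-\infty}^{+\infty}\left(\frac{\zeta^2}{\zeta_0} - 2\zeta + \zeta_0\right)dx'_1\,dx'_2\ldots dx'_n,$$

where

$$\zeta = \frac{1}{\sqrt{R}}\,\exp\left[-\tfrac{1}{2}\left\{S_1\!\left(\frac{R_{pp}}{R}\,x'^2_p\right) + 2S_2\!\left(\frac{R_{pq}}{R}\,x'_p x'_q\right)\right\}\right],$$

$$\zeta_0 = \exp\left[-\tfrac{1}{2}\left\{S_1\!\left(x'^2_p\right)\right\}\right].$$

Now

$$\int_{-\infty}^{+\infty}\!\!\ldots\int_{-\infty}^{+\infty}\exp\left[-\tfrac{1}{2}\left(c_{11}x'^2_1 + c_{22}x'^2_2 + \ldots + 2c_{12}x'_1x'_2 + \ldots\right)\right]dx'_1\,dx'_2\ldots dx'_n = \frac{(2\pi)^{n/2}}{\sqrt{\Delta}} \qquad \text{(xxix.)},$$

where

$$\Delta = \begin{vmatrix} c_{11} & c_{12} & c_{13} & \ldots & c_{1n} \\ c_{21} & c_{22} & c_{23} & \ldots & c_{2n} \\ c_{31} & c_{32} & c_{33} & \ldots & c_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & c_{n3} & \ldots & c_{nn} \end{vmatrix}.$$

* See MACDONELL, “On Criminal Anthropometry,” ‘Biometrika,’ vol. 1, p. 205 et seq.
† ‘Phil. Trans.,’ A, vol. 187, p. 302, or Ibid., A, vol. 200, pp. 3-8.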


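We are now in a position to find all the integrals involved in the equation for $\phi^2$;
we have

$$\phi^2 = \frac{1}{R\sqrt{\Delta'}} - 2 + 1 = \frac{1}{R\sqrt{\Delta'}} - 1,$$

where

$$\Delta' = \begin{vmatrix} \dfrac{2R_{11}}{R} - 1 & \dfrac{2R_{12}}{R} & \dfrac{2R_{13}}{R} & \ldots & \dfrac{2R_{1n}}{R} \\ \dfrac{2R_{21}}{R} & \dfrac{2R_{22}}{R} - 1 & \dfrac{2R_{23}}{R} & \ldots & \dfrac{2R_{2n}}{R} \\ \dfrac{2R_{31}}{R} & \dfrac{2R_{32}}{R} & \dfrac{2R_{33}}{R} - 1 & \ldots & \dfrac{2R_{3n}}{R} \\ \vdots & \vdots & \vdots & & \vdots \\ \dfrac{2R_{n1}}{R} & \dfrac{2R_{n2}}{R} & \dfrac{2R_{n3}}{R} & \ldots & \dfrac{2R_{nn}}{R} - 1 \end{vmatrix}.$$

To evaluate this determinant, we notice that since $r_{pp} = 1$, we have, if $p$ and $q$ be
different,

$$R_{p1}r_{p1} + R_{p2}r_{p2} + R_{p3}r_{p3} + \ldots + R_{pn}r_{pn} = R,$$
$$R_{p1}r_{q1} + R_{p2}r_{q2} + R_{p3}r_{q3} + \ldots + R_{pn}r_{qn} = 0.$$

Hence

$$\frac{2R_{p1}}{R}r_{p1} + \frac{2R_{p2}}{R}r_{p2} + \ldots + \left(\frac{2R_{pp}}{R} - 1\right)r_{pp} + \ldots + \frac{2R_{pn}}{R}r_{pn} = 1,$$

and

$$\frac{2R_{p1}}{R}r_{q1} + \frac{2R_{p2}}{R}r_{q2} + \ldots + \left(\frac{2R_{pp}}{R} - 1\right)r_{qp} + \ldots + \frac{2R_{pn}}{R}r_{qn} = -r_{pq}.$$

Now multiply the determinant $\Delta'$ by the determinant $R$; we find, using the above
relations,

$$\Delta' \times R = \begin{vmatrix} 1 & -r_{12} & -r_{13} & \ldots & -r_{1n} \\ -r_{21} & 1 & -r_{23} & \ldots & -r_{2n} \\ -r_{31} & -r_{32} & 1 & \ldots & -r_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ -r_{n1} & -r_{n2} & -r_{n3} & \ldots & 1 \end{vmatrix} = R', \text{ say.}$$

Here $R'$ is $R$ with the sign of all the correlations changed. Hence it follows that

$$\phi^2 = \frac{1}{\sqrt{RR'}} - 1 \qquad \text{(xxx.)}.$$

Special Cases.

(i.) Simple correlation

$$R' = R = 1 - r_{12}^2, \quad\text{and}\quad \phi^2 = r_{12}^2/(1 - r_{12}^2), \text{ as before.}$$

(ii.) Triple correlation

$$R = 1 - r_{23}^2 - r_{31}^2 - r_{12}^2 + 2r_{23}r_{31}r_{12},$$
$$R' = 1 - r_{23}^2 - r_{31}^2 - r_{12}^2 - 2r_{23}r_{31}r_{12},$$

$$\phi^2 = \frac{1}{\sqrt{(1 - r_{23}^2 - r_{31}^2 - r_{12}^2)^2 - 4r_{23}^2r_{31}^2r_{12}^2}} - 1.$$

(iii.) Quadruple correlation

$$RR' = \{1 - r_{12}^2 - r_{13}^2 - r_{14}^2 - r_{23}^2 - r_{24}^2 - r_{34}^2 + r_{12}^2r_{34}^2 + r_{13}^2r_{24}^2 + r_{14}^2r_{23}^2 - 2(r_{12}r_{14}r_{23}r_{34} + r_{14}r_{13}r_{23}r_{24} + r_{12}r_{13}r_{24}r_{34})\}^2 - 4\{r_{23}r_{24}r_{34} + r_{13}r_{14}r_{34} + r_{12}r_{14}r_{24} + r_{12}r_{13}r_{23}\}^2,$$

and so on.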

Clearly a condition has to be satisfied among the correlation coefficients, or the
process by which we have deduced $\phi^2$ is not legitimate. We must have $\Delta$ positive for
equation (xxix.) to be true. Now, for normal correlation $R$ must be real and positive,
or the equation to the multiple correlation surface becomes imaginary. Hence it
follows that $\Delta'$ must be positive, and therefore $R'$ must be positive. This seems to
give a definite condition to be satisfied by the correlation coefficients, and in some
cases rather narrow limits are enforced. For example, in the case of triple correlation
we must have

$$1 - r_{23}^2 - r_{31}^2 - r_{12}^2 - 2r_{23}r_{31}r_{12}$$

positive, and this appears to reduce very considerably the possible values for the
correlationship of three characters.* The source of this novel condition appears to lie
in the integration of the term $\zeta^2/\zeta_0$, and this is only possible by use of equation (i.),
provided the surface $Z = \zeta^2/\zeta_0$ has “ellipsoidal” contours. If it has not, we may get
the subject of integration becoming infinite with one or other of the $x$’s, and
consequently, although both $\zeta$ and $\zeta_0$ vanish at $\infty$, $\zeta^2/\zeta_0$ may not do so, i.e., the mean
square contingency tends in certain tracks to become indefinitely large. In fact, our
method of deducing multiple contingency from the normal correlation coefficients is only
valid provided the system is not only a possible correlation system with the given values
of the coefficients, but also when these coefficients all have their signs reversed.

TABLE II.—Stature of Father and Son. (Columns: Stature of Father.)

* By a printer’s error unfortunately given as .1 in ‘Biometrika.’
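
The closed form $\phi^2 = 1/\sqrt{RR'} - 1$ and the attendant condition that $R'$ be positive lend themselves to a short numerical check. The sketch below is illustrative only; the helper names are hypothetical, and the determinant is found by simple cofactor expansion, which is adequate for the small orders considered here.

```python
import math


def det(m):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def flip_signs(corr):
    """R' of the text: the correlation determinant with every r_pq (p != q) changed in sign."""
    n = len(corr)
    return [[corr[p][q] if p == q else -corr[p][q] for q in range(n)] for p in range(n)]


def phi2_multiple_normal(corr):
    """phi^2 = 1/sqrt(R R') - 1; returns None when R or R' is not positive."""
    big_r = det(corr)
    big_r_prime = det(flip_signs(corr))
    if big_r <= 0 or big_r_prime <= 0:
        return None
    return 1.0 / math.sqrt(big_r * big_r_prime) - 1.0


def family(brothers):
    """Parent with two sons: parental correlation .5, fraternal correlation as given."""
    return [[1.0, 0.5, 0.5],
            [0.5, 1.0, brothers],
            [0.5, brothers, 1.0]]


print(phi2_multiple_normal([[1.0, 0.5], [0.5, 1.0]]))   # simple correlation: r^2 / (1 - r^2)
print(phi2_multiple_normal(family(0.4)))                # R' positive, formula applies
print(phi2_multiple_normal(family(0.6)))                # R' negative, None
```

The last two lines reproduce the example of the footnote: with a parental correlation of .5, a fraternal correlation above .5 makes $R'$ negative, and the formula ceases to apply.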


(11.) Illustrations. A.—Stature in Father and Son.


Table II. gives the distribution of 1078 cases of stature in father and son.† The
correlation $r$, as found from the product moment in the usual way, is .514.

I propose to consider the approach of $C_1$ and $C_2$ to $r$ as we increase the fineness of
the grouping. Clearly it would involve extreme labour to work out the contingencies,
especially the mean square contingency, for the table as it stands.

To begin with I classed in three-inch groups and got the following table, in which
the figures in brackets are the independent probabilities.


TABLE III.—Stature of Father and Son in Inches. 


Stature of Father. 
. - . | . . . vi 
3 ‘8 s ee = Fi Totals. | Chances. 
ce | =H | i <5 oD iio) | 
co © | co i it I~ 
id . us | Bs a | ie 10 
oom il a = <H i= r) oF 
1d to) Yo) rte) ~ i 
58:5-61:5 | — 15 a ae ne = 3°5 | +0032 
(05) (+36), (1+20)| (132), (-50) | (-03) | 
61°5-64°5 Pee OR eal 33 | DAD ais ele — | O27 ODS0 
| (-84)} (6°50)| (21°75)| (23-87)| (9-02)| (-55) | 
= 64°5-67-5 S20 53:75 148 . 80°5 8:25 ae | 299 97T74 
R | (4:02) | (31°07) | (104-03) | (114-15) | (43-14) | (2°64). | 
Ss Gi2D=10-D 1 .2°5 | a3 ep) 149:25 202 +25 60°25 “ao 3) 451 4184 
2 (6-07) | (46°86) | (156-90) (172-17) | (65-06), (3°97) 
2 ORY (RIC = 3°5.. |. 39 pea Ode 26 62 oaie) | 213 | -1976 
oS y Fs a | 
5 (2°87) | (22°13)) (74°10) (81-31) | (30°73) | (1°88) | | 
‘e 73°5-76-°5 | | — PS Peele bthe qacd0 Oto) 2b i |) 41-6 | +0885 
(-56)| (4°31)| (14:44)) (15-84)| (5°99)|  (-37) 
16°D=19-5 — — =n | 4°5 3 a (Cs -0069 
C2 EO) hol 20) (2°59) (2°84)| (1:07)| (-07) 
. z 
| 7 c | | “it VEG tse. ¥ 
Pe Lotalseana | 14:5.) 112 375 PRES pos} 155°5 | Orb 1 1078 1-0000 
| ) ) | | | 


* For example, if .5 be the value of parental correlation, then the correlation of two brothers could not
exceed .5 without making $R'$ negative.
† See ‘Biometrika,’ vol. 2, p. 415.

The independent probabilities were found by multiplying the “chances” of a son
occurring in each group by the totals for each group of fathers. Taking the difference
of the observed sub-group frequencies and the independent probability frequencies, we
have $N \times \psi = 205.62$ from the positive and $= -205.66$ from the negative differences,
a quite good agreement. Hence we find $\psi = .1908$.
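
The process just described may be sketched as follows. The marginal totals are those read from the margins of Table III.; two of the son totals (62.5 and 41.5) are inferred here from the printed chances and from the grand total of 1078, and should be treated as assumptions of this sketch, as should all the variable names.

```python
# Marginal totals of Table III (sons by row, fathers by column).
son_totals = [3.5, 62.5, 299, 451, 213, 41.5, 7.5]
father_totals = [14.5, 112, 375, 411.5, 155.5, 9.5]
N = sum(son_totals)

# "Chance" of a son falling in each group, then the independent-probability frequencies.
chances = [t / N for t in son_totals]
expected = [[p * ft for ft in father_totals] for p in chances]

# Mean contingency from the two sums of differences quoted in the text.
positive_sum, negative_sum = 205.62, -205.66
psi = 0.5 * (positive_sum - negative_sum) / N

print(round(chances[2], 4))          # chance for sons of 64.5-67.5 inches
print(round(expected[2][2], 2))      # independent frequency, fathers 64.5-67.5
print(round(psi, 4))
```

Small differences in the second decimal from the bracketed figures of the table are to be expected, since the table appears to have been worked from chances rounded to four places.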

Using Diagram I. we have

$C_2 = .522$.

Proceeding now to the mean square contingency obtained by squaring all the above
found contingencies, dividing each by the independent probability frequency and
summing, we find

$\phi^2 = .2755$,

whence

$C_1 = .465$.

The value of $C_1$ is clearly too small. We must infer that our grouping was not
fine enough. Accordingly in Table IV. I have re-arranged the matter in 2-inch
groupings, and have then in the same manner proceeded to find $\psi$ and $\phi^2$. In this
case I found $\psi = .2013$, and thus

$C_2 = .542$,

while

$\phi^2 = .3568$,

and

$C_1 = .513$.
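
The passage from $\phi^2$ to the coefficient $C_1$ uses only the relation $C_1 = \sqrt{\phi^2/(1 + \phi^2)}$. A minimal sketch (the function name is hypothetical) applied to the two groupings:

```python
import math


def c1_from_phi2(phi2):
    """First coefficient of contingency from the mean square contingency."""
    return math.sqrt(phi2 / (1.0 + phi2))


for label, phi2 in (("3-inch grouping", 0.2755), ("2-inch grouping", 0.3568)):
    print(label, round(c1_from_phi2(phi2), 3))
```

$C_2$ is not computed here, since it is read from Diagram I. and not from a closed formula.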


I thus conclude that the grouping is now fine enough to give $C_1$ and $C_2$
approximately equal to the correlation.*


* i.e., within the probable error of that result.

TABLE IV.—Stature of Father and Son in Inches.


Stature of Father. 
| | = 
9 19 no 1s 12 2 oo) | 19 | | Totals. | Chances, 
vor ak) (aden ak al. i a 
1S be a [| 19 19 1 419 19 ee ee 
6 2 | & = S S mm. Freee | = | 
— ——, -— —— — — —__—— | -_— Fie —— — —_— —_— i 
59 5-61 °5 — 125 1 ik ; oo o— _— 3°5 00825 
ne on (08) (°31) (°77).|. (95) (84) | Cay) ( 1) Hat( 02) 
| 61 5-63 °5 2°75 5°75 Se0ns la a0 "2D 24 02226 
14) (56) | (2°11) | (5:29) | (6°49) | (5-73) @ $3) (: 72) | ee 2) 
63 5-65 °5 175} 20 41 °5 17°25 8°25 — | 100 09276 
( 60) | (2°32) | (8°81) | (22-03) | (27 -04) | (23 89) | (11: 78) (8-01) | (51) 
3 65 *5-67 °5 10 32 73 Pree ll BETES . (ber amy Sat _— 287 °5 "22082 
s a 43) | (5°51) | (20 98) | (52-88) | (64-22) | (56-73) | (27 $5) 7 16) (1-21) 
n 67 5-69 °5 — 45 27°75 65°5 | 95 93 -25 31°5 1 323 29963 
“3 (1 *95) | (7 -49) | (28°46) | (71°16) | (87°34) (77°15) | e oe @ 74) (1°65) 
2 69 °5-71 5 —- — Gr7beles8"25) | 6 7 coed 2 236 *21892 
2 (1-42) | (5-47) | (20°80) | (51-99) | (63 -82) | (56°37) | (7 30) | (7 11) | (1-20) 
= | 71°5-73 5 | — — | *25 5°75 | 24°75 345 32 sa es 5 105 09740 
oo (68) | (2°44) | (9°5) | (23-18) | (28-39) | (25-08) | (12-37) | (8-17) | (54) | 
13°5-75°5 | — es < 3 O25 1 6764 13 (eee? sg 37°5 | -08479 
(-28)] (87) (8°31) | (8 -26)| (10-14) | (8-96) | (4-42) (1-13) | (-19) | 
155-975 | — Pape Pe = 2°5 15 15 | 25 — | 8 | -00742 | 
(-05)) (°19)| (-70)| (1°76)| (2°16)| (1°91)| (-94)| (-24)| (-04) | | 
77 5-79 °-5 — == —- — — 2 | 5 HT — 8:5) "00325 
(-02)} (08); (°81) (77) ( 95) (84) (41) | (11) } (-02) | 
| | | | ] | 
Totals. .| 6°5 25 95 237 5 291°5 | 257°5 127 | 32 °5 5°56 | 1078 | 100000 
| i 


To show the effect of too fine a grouping, I worked out the mean contingency for
the inch grouping in Table II. There resulted

$\psi = .2309$, giving $C_2 = .597$.

I therefore conclude that with sufficiently fine grouping the new method of
contingency will give contingency coefficients sensibly equal to the correlation
coefficient; but that with over-fine grouping the effect of individual units, scattered
here and there at random over the table, becomes influential and exaggerates the
value of the correlation. Hence, when a correlation table can be formed and worked
in the old ways, there is little doubt that it is safer to do so, and the labour will
hardly be sensibly greater, at least when compared with the method of mean square
contingency. I have not faced the labour required to determine the mean square
contingency of the table with 340 sub-groups. Dr. LEE has worked out the mean
square contingency for a table with 400 sub-groups, and we do not think it desirable
to deal with a table of more than $10^2$ to $15^2$ entries again. Still the mean square
contingency coefficient will hardly be as great on the full table as the mean
contingency coefficient.

The following table gives the results:—

Comparison of Methods of Finding Correlation.

  No. of groupings. | Mean contingency. | Mean square contingency. | Fourfold division.        | Correlation table.
  42                | .522              | .465                     | (Mean of six divisions)*  | —
  90                | .542              | .513                     | .550                      | —
  340               | .597              | —                        | —                         | .514

Thus the first contingency method approaches the fourfold, the second, the 
ordinary correlation method. 

Diagram II. at the end of this memoir gives the hyperbola of zero contingency for
this case, calculated on the basis of the correlation coefficient being .514. The means
and standard deviations are:—


FatherGue gee 67698, 27048, 
Son Se Le Ree pet 27321, 


and the equation to the hyperbola referred to the means as the origin is

$$x^2 - 3.8522\,xy + .9801\,y^2 = 6.2510.$$

The shaded squares are those of positive contingency. It will be seen that the
hyperbola separates fairly well areas of positive from areas of negative contingency.
In most cases where there is an invasion across the boundary, the contingencies
hardly differ from zero by amounts greater than the probable errors due to random
sampling.
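
The coefficients of this hyperbola can be recovered from $r$ and the two standard deviations. Equating the normal correlation surface to the surface of independent probability gives, with the means as origin, $x^2 - \dfrac{2\sigma_x}{r\sigma_y}xy + \dfrac{\sigma_x^2}{\sigma_y^2}y^2 = -\sigma_x^2\,\dfrac{1 - r^2}{r^2}\log_e(1 - r^2)$. The sketch below evaluates these terms; the function name is hypothetical, and the assignment of $x$ to the father (standard deviation 2.7048) and $y$ to the son (2.7321) is an inference from the coefficient .9801, not a statement of the text.

```python
import math


def zero_contingency_hyperbola(r, sigma_x, sigma_y):
    """Coefficients (B, C, K) of x^2 - B*x*y + C*y^2 = K, origin at the means."""
    coef_xy = 2.0 * sigma_x / (r * sigma_y)
    coef_yy = (sigma_x / sigma_y) ** 2
    constant = -sigma_x ** 2 * (1.0 - r * r) / (r * r) * math.log(1.0 - r * r)
    return coef_xy, coef_yy, constant


B, C, K = zero_contingency_hyperbola(0.514, 2.7048, 2.7321)
print(round(B, 4), round(C, 4), round(K, 4))
```

Points for which the left-hand side falls short of the constant lie between the two branches, where the normal surface exceeds the independent surface and the contingency is accordingly positive.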


Illustration B.—Data from Colour Inheritance in Greyhounds. 


In the previous example we have dealt with material in which contingency methods
were directly comparable as to result with the correlation found by the “best” or
product method process. In this illustration I deal with matter which can only
provide a correlation to be found by the fourfold division process for comparison with
the contingency coefficients. The data from which this illustration is drawn were
extracted by Miss A. BARRINGTON from the ‘Greyhound Studbook.’ We deal with
the inheritance of red and black pigments in the coat colour. I have selected six
cases of the resemblance of brethren from different litters to compare the methods on.
Tables were formed giving 16 to 25 contingency sub-groups of varying degrees of
pigment, and these were worked out (a) by Miss BARRINGTON herself for the mean
square contingency, (b) by myself for the mean contingency, and (c) by Dr. A. LEE
for the fourfold correlation results. The results reached are given in the accompanying
table. It is desirable to state that the number dealt with was about 1000 pairs of
brethren in each case.

* See ‘Phil. Trans.,’ A, vol. 195, p. 42. The values range from .521 to .594, or almost the same range
as we obtain from the mean contingency results.


TABLE V.—Fraternal Resemblance of Greyhounds from Different Litters.

  Character.                     | C1, Mean Square Contingency. | C2, Mean Contingency. | r, Fourfold Table.
  Red in brothers                | .478                         | .695                  | .456
  Red in sisters                 | .528                         | .612                  | .620
  Red in sister and brother      | .488                         | .615                  | .450
  Black in brothers              | .512                         | .615                  | .558
  Black in sisters               | .482                         | .632                  | .552
  Black in sister and brother    | .502                         | .622                  | .593
  Mean                           | .498                         | .632                  | .538
  Mean deviation from mean       | .016                         | .032                  | .057


We see at once from this table that the method of mean square contingency gives
far more uniform results than either the mean contingency method or the fourfold
division method. The average given by it is close to what we have found for
fraternal resemblance, i.e., .5, in other cases, and within fairly close limits, all six
cases now give .5. The mean contingency gives results more divergent among
themselves, but less so than those of the fourfold division method; their average,
however, diverges most from what we have found in other cases.

The lesson, I think, to be learnt from this is: That the mean square contingency
coefficient, although more laborious to find, is better than the mean contingency
coefficient. That even with only 16 to 25 contingency sub-groups we may deduce
results comparable with those obtained by fourfold divisions. But that it is probably
always necessary to check a series by a certain number of fourfold division workings,
for such are the only test that we have not got too crude a grouping reducing the
contingency coefficient below the correlation value, or too fine a grouping introducing
the difficulty already referred to (see p. 16), of magnifying the contingency coefficient
owing to anomalous units.


Illustration C.—Hair Colour in Man. 


I take the subject of hair colour because it is one in which doubts have been raised 
as to the order of pigments in a scale. 
The following table gives the resemblance of pairs of brothers in hair colour:—

TABLE VI.

                                      First Brother.
  Second Brother. | Red.  | Fair.  | Brown. | Dark.  | Jet Black. | Totals.
  Red             | 30.5  |  23    |  16    |  12    |  —         |   81.5
  Fair            | 23    | 416    | 158    |  67.75 |   .25      |  665
  Brown           | 16    | 158    | 394    |  98.25 |  8.25      |  674.5
  Dark            | 12    |  67.75 |  98.25 | 328.5  | 19         |  525.5
  Jet Black       |  —    |    .25 |   8.25 |  19    | 10         |   37.5
  Totals          | 81.5  | 665    | 674.5  | 525.5  | 37.5       | 1984

The correlation found by taking the mean of four four-fold table divisions was .621.*

This result is based on the above scale order. We will now see what difference will
arise if we work by contingency, so that the scale order is absolutely indifferent, e.g.,
red might follow jet black.

We find

$\phi^2 = .603896$,

and accordingly $C_1 = .614$, a result within the limits of the probable error identical
with the value of $r$ found from the four-fold division method.

This illustration confirms the opinion I have already expressed, i.e., that if the
contingency be calculated for 16 to 36 sub-groups we shall obtain by the method of
mean square contingency satisfactory results, i.e., values close to the coefficient of
correlation as found by product moment or four-fold division methods. In this case,
as in others, I find the mean contingency far inferior to the mean square contingency.

My experience seems to show that about 25 sub-groups is the distribution to be
aimed at; 9 is too few. Thus I worked out the relationship of temper in sisters for
three-fold division (sullen, good-tempered, quick-tempered), or for 9 sub-groups.
The method of mean contingency gave .44 and of mean squared contingency .36.
Both are far too small, as I find from each of four four-fold divisions a result of about .5.
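
The mean square contingency of the hair-colour table may be re-worked from the sub-group frequencies, using $\phi^2 = S\{n_{uv}^2/(n_u n_v)\} - 1$, which follows from the definition on expanding the square. In the sketch below the two diagonal entries for Red and Jet Black (30.5 and 10) are inferred from the marginal totals and the Red with Jet Black entry is taken as zero; these, like the function names, are assumptions of the sketch.

```python
import math

labels = ["Red", "Fair", "Brown", "Dark", "Jet Black"]
table = [
    [30.5, 23.0, 16.0, 12.0, 0.0],
    [23.0, 416.0, 158.0, 67.75, 0.25],
    [16.0, 158.0, 394.0, 98.25, 8.25],
    [12.0, 67.75, 98.25, 328.5, 19.0],
    [0.0, 0.25, 8.25, 19.0, 10.0],
]


def mean_square_contingency(t):
    """phi^2 = sum of n_uv^2 / (n_u * n_v) over all sub-groups, less 1."""
    rows = [sum(r) for r in t]
    cols = [sum(c) for c in zip(*t)]
    total = sum(t[i][j] ** 2 / (rows[i] * cols[j])
                for i in range(len(rows)) for j in range(len(cols)))
    return total - 1.0


phi2 = mean_square_contingency(table)
c1 = math.sqrt(phi2 / (1.0 + phi2))

# The scale order is indifferent: place Jet Black first, so that red follows jet black.
order = [4, 0, 1, 2, 3]
reordered = [[table[i][j] for j in order] for i in order]
phi2_reordered = mean_square_contingency(reordered)

print(phi2, c1, phi2_reordered)
```

The value obtained agrees with that of the text to about the third decimal place, and is unchanged by the re-ordering of the classes.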


Illustration D.—On Occupational or Professional Correlation between Relatives. 


I take as a final illustration a case in which any idea of scale is practically
inconceivable, and yet one in which it is of considerable interest to measure the
deviation from independent probability. It belongs to a class of problems in which I
hope this new method of contingency will be fruitful of result. In classifying men
into occupational and professional groups, we clearly cannot do so on the basis of any
scale which will put the army, church, and bar in any special order. On the other
hand, it becomes of special interest to determine how far tastes and preferences for
particular callings in life run in families. Miss EMILY PERRIN has undertaken a
lengthy investigation of this kind, and has provided me with the pure contingency
table given as Table VII. The occupations of 775 fathers and sons are here classed
in broad general groups, which can be arranged purely alphabetically. More minute
divisions and data for other series of relatives will be published later by Miss PERRIN,
and it is not my present purpose to anticipate her conclusions, but merely to suggest
the valuable applications which may be made of the novel methods to pure
contingency results. What is the numerical measure of the relationship in pursuit
between father and son, and how far is it removed from a mere chance relationship?

* “Huxley Memorial Lecture,” ‘Journal of Anthropological Institute,’ vol. 33, pp. 197 and 215.


TABLE VII.—Contingency between Occupations of Fathers and Sons.


Occupation of Son. 
| : 
eS ‘ a} 
EE a 8/5 
Nature of occupation. gays s | zB , . a | 
2 FA = ae Hlolg iad 
io 6 = Pa _ no 
. | a = = | = 5 o a 2 BS a 
e (se |S} / 8/18 eile) ae wets sels 
coe St a: 5 3 rl 2 D is rd SR me) 
wees Ole ees toi Ofte tS elo fe 
=r U — 
Army. 98 —— | 14 | | | — | ese Le bo 2 BO 
xt eee sate Po: OL 1 2);—}|— 1 2 -- 1 1 62 
Teacher, Clerk,) ai takes l a 7 | 54 
ee ea ek il) Beat 2 | 
# | Crafts — |,;12;) — 6; 5;—);— | 1 7 1/ 2}—;—j 10 | 44 
S | Divinity . 5 5 2 1| 54;—j|—)| 6 9 Aili l 3a) oS 1 13 {115 
g, | Agriculture . —j| 2 i eat | ae ae 2) | 26 
4; | Landownership | 17 | 1 4-}—/|14}—| 6;11} 4] 1) 38) 3) 17 7 | 88 
= | Law 3|-9 6 };—]| 6| — 2) 18 | 13 1 1 1 8 i) 69 
-< | Literature eae 1 7G es a ie er al a el (es 4 | 19 
S, | Commerce 12 | 16 4 1}15|)—|—| 5} 13); 11) 6) 1) 7) 1 {106 
5 | Medicine. ey ia be! 2 1)/—|—|—j| 3|—|20;—j| 5 6 | 41 
Se Naty ot hy 3 Pp fay ry ay ry 6] 2) hf | 18 
pole: ia, cc ee a en ed a ea a | 
Court % .. | | | | 
ae epee oe ey ot GGT ee ee ee ee ey 
and Science . 
) | | 
Totals | 84 | 108 | 37 | 11 | 122 | 1 | 15 | 64 | 69 | 24 | 57 | 23 | 74 | 86 | 775


Miss Perrin has extracted this first series from the ‘Dictionary of National
Biography’; hence she has, as a rule, tabled the distinguished, or at least moderately
distinguished, sons of less distinguished fathers. It is, for example, not easy to win
any form of distinction in agriculture. For this reason the distribution of occupations
for sons differs widely from that of the occupations for fathers. There has accordingly
been selection of the second generation, which undoubtedly must influence the
result, i.e., tend to weaken the observed relationship.

Working out the 196 contingencies, squaring, dividing by the independent
probability frequencies, summing and averaging, I find for the mean square 
contingency 


$$2\phi^2 = 1.299206,$$
whence 


$$\phi^2/(1 + \phi^2) = 0.393794,$$


and the coefficient of mean square contingency = 0.6275. This would correspond to
the correlation in occupation between father and son. Now if occupation were settled 
solely by fitness or taste, and these characters were inherited as other human faculties, 
we should expect the correlation between father and son to be about 0.46.* Or,
roughly, the hereditary relationship is increased by about 1/3 in the matter of
occupation. Remembering what we have noted as to selection above, the real
increment is probably somewhat larger than this. Roughly, however, we may 
conclude from Miss Perrin's data that about 3/4 of the observed resemblance in
occupation between father and son is due to hereditary influences, and the remaining
1/4 to environmental effect. These numbers are subject to revision when Miss Perrin's
data are more ample and have been more fully analysed and discussed. 
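
The arithmetic described above (contingencies squared, divided by the independent probability frequencies, summed and averaged) can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the memoir: the function names and the small 2 x 2 table are hypothetical, and the only figures taken from the text are 0.393794, 0.6275 and 0.46.

```python
import math


def mean_square_contingency(table):
    """Return phi^2 for a rectangular table of counts (a list of rows).

    Each contingency is (observed - expected), where expected is the
    frequency under independence: row total * column total / N.
    """
    n_total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n_total
            if expected == 0:
                continue  # an empty row or column contributes nothing
            total += (observed - expected) ** 2 / expected
    return total / n_total  # "averaging": divide the sum by N


def contingency_coefficient(phi2):
    """Coefficient of mean square contingency, sqrt(phi^2 / (1 + phi^2))."""
    return math.sqrt(phi2 / (1.0 + phi2))


# Hypothetical table (NOT Miss Perrin's data): complete association.
toy = [[10, 0], [0, 10]]
toy_phi2 = mean_square_contingency(toy)      # 1.0 for this table
toy_c = contingency_coefficient(toy_phi2)    # sqrt(1/2)

# Check of the figures quoted in the text.
ratio = 0.393794                  # phi^2 / (1 + phi^2), as printed
c_text = math.sqrt(ratio)         # rounds to 0.6275
hereditary_share = 0.46 / c_text  # part of the resemblance matched by 0.46
relative_excess = (c_text - 0.46) / 0.46

print(round(c_text, 4), round(hereditary_share, 3), round(relative_excess, 3))
```

The last two quantities simply compare the observed coefficient with the value 0.46 expected from inheritance alone; they carry no more weight than the rough fractions given in the text.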


(12.) General Conclusions. 


The general conception of contingency developed in this memoir I consider in the 
first place of theoretical importance. Its practical applications are not negligible, but 
are, for reasons given below, of less importance than might a priori be supposed.

(a.) In the first place, the conception of contingency enables us at once to generalise
the notion of the association of two attributes developed by Mr. Yule. We can class
individuals not into two alternate groups, but into as many groups with exclusive
attributes as we please, and either the mean contingency or the mean square
contingency will enable us to see the extent to which two such systems are contingent
or non-contingent.

(b.) This result enables us to start from the mathematical theory of independent
probability as developed in the elementary text books, and build up from it a 
generalised theory of association, or, as I term it, contingency. We reach the notion 
of a pure contingency table, in which the order of the sub-groups is of no importance 
whatever. 

(c.) We then investigate the relation of contingency to normal correlation, and
find that with normal frequency distributions both contingency coefficients pass with
sufficiently fine grouping into the well-known correlation coefficient. Since, however,
the contingency is independent of the order of grouping, we conclude that when we
are dealing with alternative and exclusive sub-attributes, we need not insist on the
importance of any particular order or scale for the arrangement of the sub-groups.

* ‘Biometrika,’ vol. 2, p. 379.

(d.) This conception can be extended from normal correlation to any distribution
with linear regression; small changes (i.e., such that the sum of their squares may
be neglected as compared with the square of mean or standard deviation) may be
made in the order of grouping without affecting the correlation coefficient.

(e.) The results (c) and (d) are not so fruitful for practical working as might at
first sight appear, for they depend in practice on the legitimacy of replacing finite 
integrals by sums over a series of varying areas, where no quadrature formula is 
available. If we, to meet the difficulty, make a very great number of small classes, 
the calculation, especially of the mean square contingency, becomes excessively 
laborious. Further, since in observation individuals go by units, casual individuals, 
which may fairly represent the total frequency of a considerable area, will be found 
on some one or other isolated small area, and thus increase out of all proportion the 
contingency. The like difficulty occurs when we deal with outlying individuals in 
the case of frequency curves, only it is immensely exaggerated in the case of 
frequency surfaces. 

(f.) It is thus not desirable in actual practice to take too many or too fine
sub-groupings. It is found, under these conditions, that the correlation coefficient as
determined by the product moment or fourfold division methods is approximated to 
more closely in the case of the contingency coefficient found from mean square 
contingency than in the case of that found from mean contingency. Probably 
16 to 25 contingency sub-groups will give fairly good results in the case of mean 
square contingency, but for each particular type of investigation it appears desirable 
to check the number of groups proper for the purpose by comparison with the results 
of test fourfold division correlations. Under such conditions it appears likely that 
very steady and consistent results will be obtained from mean square contingency. 

(g.) Finally, contingency may be applied, of course at first tentatively and with
caution, in the consideration of a whole class of problems in which no attempt at a
scale or order of sub-groups is possible, in short, where alphabetical order is as good 
as any other. For example, it would seem to be available in a vast range of problems 
of exclusive and alternative inheritance. 
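
Two of the conclusions above lend themselves to a quick numerical check: that the mean square contingency does not depend on the order of the sub-groups, and that a single casual individual on an isolated small area can increase the contingency out of all proportion. The Python sketch below is illustrative only; both tables are hypothetical and the function names are not from the memoir.

```python
import math


def mean_square_contingency(table):
    """phi^2: sum of (observed - expected)^2 / expected, divided by N."""
    n_total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n_total
            if expected == 0:
                continue  # an empty row or column contributes nothing
            total += (observed - expected) ** 2 / expected
    return total / n_total


def contingency_coefficient(phi2):
    """Coefficient of mean square contingency, sqrt(phi^2 / (1 + phi^2))."""
    return math.sqrt(phi2 / (1.0 + phi2))


def reorder(table, row_order, col_order):
    """Return the same table with its rows and columns rearranged."""
    return [[table[i][j] for j in col_order] for i in row_order]


# 1. Order of sub-groups: a hypothetical 3 x 3 table and a shuffled copy.
toy = [[20, 5, 1], [6, 30, 4], [2, 7, 25]]
shuffled = reorder(toy, [2, 0, 1], [1, 2, 0])
order_invariant = math.isclose(
    mean_square_contingency(toy), mean_square_contingency(shuffled)
)

# 2. One stray individual: 400 observations with no contingency at all,
#    then a single individual placed in an otherwise empty row and column.
clustered = [[100, 100, 0], [100, 100, 0], [0, 0, 0]]
with_stray = [[100, 100, 0], [100, 100, 0], [0, 0, 1]]
phi2_before = mean_square_contingency(clustered)   # exactly 0
phi2_after = mean_square_contingency(with_stray)   # very nearly 1
c_after = contingency_coefficient(phi2_after)

print(order_invariant, phi2_before, round(phi2_after, 6), round(c_after, 4))
```

In the second table the expected frequency of the isolated cell is only 1/401, so its one occupant supplies almost the whole of the sum. This is an extreme case, chosen to make the effect of very fine sub-grouping visible.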
Diagram II. Illustrating areas of positive and negative Contingency and the Hyperbola of Zero-Contingency.

[Diagram: grid of sub-groups; axes: Stature of Father, Stature of Son.]

Sub-groups with plus contingency are marked with one symbol; sub-groups within the hyperbolic area, where there is no frequency in the observations, and which must therefore give negative contingency, are marked with another.

N.B. Owing to an oversight on the part of the engraver, the absolute squareness of the elements in the
original drawing has been disregarded in this reproduction.